Ray Tune and Ray RLlib

Hi,
I have finally managed to run PPOTrainer.train() without many issues (thanks to @mannyv).
However, when I look at the rewards, I see that my agent is not learning.

  1. Should I first use Ray Tune for hyperparameter tuning and then use PPOTrainer.train()?
  2. How can I display the results from train() better?

Thank you in advance!

Always visualize your results to understand what your agent is doing.
A good starting point for debugging an agent that does not learn is http://joschu.net/docs/nuts-and-bolts.pdf
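
For question 2, here is a rough sketch of one way to make the output of train() easier to read. It assumes each call to train() returns a dict with an "episode_reward_mean" entry and a "training_iteration" entry, which is what I would expect from PPOTrainer, but check the keys in your own result dict. The helper names reward_curve and print_curve, and the stand-in data, are made up for illustration.

```python
from collections import deque


def reward_curve(results, key="episode_reward_mean", window=10):
    """Build (iteration, reward, moving_average) rows from train() result dicts.

    `key` is an assumption about the result dict; adjust it to your setup.
    """
    recent = deque(maxlen=window)  # holds the last `window` rewards
    rows = []
    for i, result in enumerate(results, start=1):
        reward = result[key]
        recent.append(reward)
        iteration = result.get("training_iteration", i)
        rows.append((iteration, reward, sum(recent) / len(recent)))
    return rows


def print_curve(rows):
    """Print the rows as a small aligned table."""
    print(f"{'iter':>5} {'reward':>10} {'moving avg':>10}")
    for iteration, reward, avg in rows:
        print(f"{iteration:>5} {reward:>10.2f} {avg:>10.2f}")


# Stand-in data so the sketch runs without Ray. In a real run you would
# collect the dicts yourself, e.g. results.append(trainer.train()).
fake_results = [
    {"training_iteration": i, "episode_reward_mean": float(r)}
    for i, r in enumerate([10, 12, 11, 15, 18], start=1)
]

rows = reward_curve(fake_results, window=3)
print_curve(rows)
```

The moving average is there because the per-iteration mean reward is usually noisy, and a smoothed column makes it easier to tell a flat curve from a slowly rising one. The same rows can be handed to any plotting library. As far as I know, RLlib also writes its results to a log directory that TensorBoard can read, which may be the quicker option.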