Use Policy_Trainer with TensorBoard

@Denys_Ashikhin,

What version of Ray are you currently using?
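In case it helps, here is a small sketch for reading the installed version from package metadata without importing the library. The helper name `installed_version` is my own, not part of Ray, and it assumes Ray was installed under the distribution name `ray`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "ray") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed.

    `installed_version` is a hypothetical helper for this thread, not a Ray API.
    """
    try:
        # Reads the version recorded in the package's installed metadata.
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    print("ray version:", installed_version())
```

If Ray imports cleanly in your environment, `print(ray.__version__)` after `import ray` should give the same information.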

There is a bug with RNN sequencing in the latest release.

You can avoid it with these settings (assuming you are not trying to train on multiple GPUs).