Use Policy_Trainer with TensorBoard

Hi All,

I am using a policy client + server setup for training; however, I can't figure out how to get TensorBoard to display any information for the training runs. Is there a parameter I need to pass?

Moreover, can I get TensorBoard statistics on the policy_client side (even though this is a PPO model, so all training happens on the server) for custom metrics? On the client side I am mainly interested in specific reward sources, to see what contributes to the reward over time and find trends in the AI's decision-making process.

Thanks in advance!

You can log your custom metrics in a custom callbacks class. See: RLlib Training APIs — Ray v1.8.0
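
For example, a minimal sketch of such a callbacks class (assuming your env reports a per-episode reward breakdown in its info dict; the "reward_sources" key and metric names here are just placeholders for whatever your env actually returns):

from ray.rllib.agents.callbacks import DefaultCallbacks

class RewardSourceCallbacks(DefaultCallbacks):
    def on_episode_end(self, *, worker, base_env, policies, episode, env_index, **kwargs):
        # Assumes the env puts something like
        # info["reward_sources"] = {"goal": 3.0, "time_penalty": -0.5}
        # into its info dict.
        info = episode.last_info_for() or {}
        for source, value in info.get("reward_sources", {}).items():
            episode.custom_metrics[f"reward_{source}"] = value

Then pass it to the trainer with config["callbacks"] = RewardSourceCallbacks; the values should show up in TensorBoard under custom_metrics with _mean/_min/_max suffixes.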

I saw that. My question is: how do I actually hook up TensorBoard to display my values? What folder do I specify from the CLI? And are other values logged automatically (like average episode length, loss, etc.), or do I need to log them myself?

Normally, logs are stored in ~/ray_results. You can run the TensorBoard command from inside the ray_results folder.

Where’s the ~/ray_results folder located?

Hmm… Do you use Ray on Linux?

No, it's currently on Windows.

Probably in C:\Users\yourusername\ray_results?


Oh, I only know where the ray_results folder is on Linux, so I can't answer your question. Sorry about that.

Does this look correct?

Yes, it does :slight_smile:

If you now call tensorboard --logdir C:\Users\yourusername\ray_results\PPO_RandomEnv_2021-11-11_07-39-05qxgyfj5g you should be able to see the results.


Yup that works! Awesome, thanks a bunch everyone!


One last question: how would I go about giving a specific name to the folders so they are more legible and not a complete slew of characters?

I have never done it myself, as I find the default name sufficient (trainer used, date, time, and hash code). You can also just call

tensorboard --logdir C:\Users\yourusername\ray_results

to get all logs loaded into TensorBoard. You can then check the runs you want to see.

In case you still want to change names, this might help you.

You can provide a name to tune.run, which I think should affect the name of the log directory.

That would be very nice. Is this as simple as adding a few lines to call tune.run(policy_trainer)?

tune.run(policy_trainer, name="run_name")


# (loop header and import shown for context; i, trainer, and checkpoint_path
# are defined earlier in the script)
from ray.tune.logger import pretty_print

while True:
    print(pretty_print(trainer.train()))
    print(f"Finished train run #{i + 1}")
    i += 1
    if i % 2 == 0:  # save a checkpoint every 2 iterations
        checkpoint = trainer.save(checkpoint_path)
        print("Last checkpoint", checkpoint)

That's currently my loop. Do I change print(pretty_print(trainer.train())) to tune.run(trainer, name="run_name")?

You could do, for example:

tune.run("PPO", config=config, stop={#your stop criteria}, name="run_name")

Another option: the default run name includes the env name, so you could register the env under different names that include whatever extra info you want for each run.
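
For example, a rough sketch of both options (the config values, stop criterion, and names below are placeholders, and RandomEnv just stands in for your own env class):

from ray import tune
from ray.rllib.examples.env.random_env import RandomEnv  # stand-in for your own env
from ray.tune.registry import register_env

# Option 1: give the Tune run an explicit name.
config = {"env": RandomEnv, "num_workers": 1}
tune.run("PPO", config=config, stop={"training_iteration": 100}, name="ppo_reward_shaping_v1")

# Option 2: register the env under a descriptive name and point the config at it;
# the auto-generated folder name (PPO_<env name>_<date>_<hash>) then carries that info.
register_env("RandomEnv-baseline-reward", lambda env_config: RandomEnv(env_config))
config = {"env": "RandomEnv-baseline-reward", "num_workers": 1}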

I don't really have stop criteria though; I just want to save the model every x iterations (each iteration takes 1-4 minutes to train and about 35 minutes to collect enough samples)…