Logging custom metrics by trial during PBT training

I’ve been searching the Ray documentation and the examples in its GitHub repo for a while now, but can’t seem to find a solution.

I’m training multiple trials (the num_samples parameter in tune.run) of PPO with Tune’s PBT scheduler, and I’d like TensorBoard and CSV log files of some simulator-related custom metrics recorded during training.
Say I’m running 10 samples: I’d like to know the custom-metric values each sample outputs after every episode while training.

The issues I’m facing are:

  • There seems to be no way to access a per-sample index such as 0–9 (since there are 10 samples); worker_index isn’t it, because each sample uses multiple RolloutWorkers. I’d like to log the average of the custom metrics after each iteration.
  • I’ve read the DefaultCallbacks API and its example. As the documentation states, this callback class is only “called at points during policy evaluation”, which is not what I want; I’m not using the evaluation option here.
  • However, in case I do use evaluation in the near future, I’d like to know how to access/mutate the episode (MultiAgentEpisode) object shown in the example. The example (custom_metrics_and_callbacks.py in the ray-project/ray GitHub repo) uses episode.user_data, episode.hist_data, episode.custom_metrics, etc. How do I access/use these?

Thanks for any help.

Edit: if someone can answer even just the third bullet, that’d be great. I have no idea how to attach my custom metrics to the episode object from inside my simulator. (FYI, the custom simulator uses the ExternalEnv API.)