I have recently (with massive help from the community) gotten TensorBoard logging working for the trainer process.
However, my reward is made up of multiple sources, which I keep track of inside the env (Policy_Client). I saw the RLlib Training APIs docs (Ray v1.8.0); however, from what I understood, that approach applies to local worker processes.
I have multiple workers on different machines collecting samples at once, so I was wondering if there is a way to report these reward sources and have them show up on the trainer's (PolicyServer) TensorBoard. However, even if I can only get them on each individual worker, that is fine as well.
In particular, I am only interested in reporting the various sources of the final reward at the end of the episode, but if that is not possible, per-time-step reporting works as well, although it might look a bit funny.
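For context, here is a minimal sketch of what I am doing on the env side right now. The `RewardComponentTracker` class and the component names (`distance`, `energy`) are just placeholders of mine; the part I am unsure about is how to get the episode summary over to the trainer, e.g. via the `info` dict passed to `PolicyClient.log_returns` and a server-side callback writing into `custom_metrics` (both assumptions on my part):

```python
from collections import defaultdict


class RewardComponentTracker:
    """Accumulates the individual reward sources over an episode so the
    totals can be reported once at episode end (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    def add(self, **components):
        # Called every step with the named reward sources;
        # returns the combined scalar reward for that step.
        for name, value in components.items():
            self.totals[name] += value
        return sum(components.values())

    def episode_summary(self):
        # Called once at episode end; returns the per-source totals
        # and resets the tracker for the next episode.
        summary = dict(self.totals)
        self.totals.clear()
        return summary


# Per-step usage inside the env loop:
tracker = RewardComponentTracker()
step_reward = tracker.add(distance=1.0, energy=-0.25)   # step reward = 0.75
tracker.add(distance=0.5, energy=-0.25)

# At episode end, this dict is what I would like to see in TensorBoard:
print(tracker.episode_summary())  # {'distance': 1.5, 'energy': -0.5}
```

The idea would be to attach `episode_summary()` to the final step's info and have the server side pick it up, but I have not found where that hook lives in the client/server setup.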
Thanks in advance!