How severe does this issue affect your experience of using Ray? Low: It annoys or frustrates me for a moment. I want to monitor the episode reward throughout training, rather than seeing min, max, mean plots. I used this callback and my tensorboard is updated with the custom metric, but it automa…

Hi @VisionZUS29, the reason it reports a mean, min, and max is that during one iteration it is potentially sampling transitions from multiple episodes. If multiple episodes complete, there will be multiple episode returns, and the metrics report the min, max, and mean over all of those episodes. Hist stats…

@mannyv What do you mean by iteration? Is it sampling based on whether the config has collections as complete_episodes vs truncate_episodes? I guess then I am looking to be able to plot reward vs. episode rather than steps? My env terminates after a set amount of steps so each episode has the same am…

Perhaps this thread will be helpful: "Meaning of episode_reward_mean" (RLlib). What is the meaning of the episode_reward_mean metric? Is it the sum of the reward obtained in each time step of the episode? What is the difference between episode_reward_mean and episode_r…

@mannyv I get the gist of what you are saying. I understand the difference in mean, min, max. Quoted from "Meaning of episode_reward_mean": RLlib collects a number of metrics. One of those is the episode_reward. When creating the summary it will compute and store the mean, min, and max of those metri…

Hi @VisionZUS29, I'm not quite sure if this is what you want, but I think you can set it in “episode.hist_data” like this:

def on_episode_end(self, *, worker, base_env, policies, episode, **kwargs):
    # Log the total reward for the episode
    print("Episode Reward: ", episode.total…
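The suggestion above can be sketched without Ray installed. The class below only mirrors the `on_episode_end` hook signature quoted in the reply; in RLlib it would presumably subclass `DefaultCallbacks` from the old API stack, which is an assumption here. The metric key `episode_total_reward` and the `SimpleNamespace` stub standing in for a real episode object are hypothetical and exist only so the snippet runs on its own.

```python
from types import SimpleNamespace


class EpisodeRewardCallbacks:
    """Ray-free stand-in; in RLlib this would subclass DefaultCallbacks (assumption)."""

    def on_episode_end(self, *, worker, base_env, policies, episode, **kwargs):
        # Store the episode return as a one-element list: RLlib's own examples
        # put lists into hist_data so values from many episodes can be joined.
        # The key name "episode_total_reward" is made up for this sketch.
        episode.hist_data["episode_total_reward"] = [episode.total_reward]


# Hypothetical stub exposing only the two attributes the hook touches.
fake_episode = SimpleNamespace(total_reward=12.5, hist_data={})

EpisodeRewardCallbacks().on_episode_end(
    worker=None, base_env=None, policies=None, episode=fake_episode
)
print(fake_episode.hist_data)
```

Unlike a scalar written to `episode.custom_metrics`, a list stored this way is not collapsed into mean, min, and max, so the individual episode returns stay available in the results.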

Custom Tensorboard Metric (episode.total_reward auto generates as mean, min, max)

RLlib

VisionZUS29 June 12, 2024, 3:43am 5

@mannyv
I get the gist of what you are saying. I understand the difference in mean, min, max.

So why can’t I plot the reward of every episode that completed during training, if it’s already being collected and sorted per train_batch_size?
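To make the distinction in this question concrete, here is a minimal Ray-free sketch. It assumes, as explained earlier in the thread, that one training iteration can contain several completed episodes. The numbers and the names `summarize`, `iterations`, and `per_episode` are invented for illustration and are not RLlib APIs.

```python
from statistics import mean


def summarize(episode_returns):
    """Collapse all episode returns from one iteration into min/max/mean."""
    if not episode_returns:
        return None  # no episode finished during this iteration
    return {
        "min": min(episode_returns),
        "max": max(episode_returns),
        "mean": mean(episode_returns),
    }


# Hypothetical returns of the episodes that completed in three iterations.
iterations = [[10.0, 14.0], [12.0], [9.0, 11.0, 16.0]]

# What a scalar-per-iteration plot shows: one summary point per iteration.
summaries = [summarize(returns) for returns in iterations]

# What "reward vs. episode" needs: the flattened per-episode series.
per_episode = [r for returns in iterations for r in returns]

for index, episode_return in enumerate(per_episode):
    print(index, episode_return)
```

The per-iteration summaries have three points while the flattened series has six, which is why a per-episode plot needs the raw list of returns rather than the summarized scalars.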

Related topics

Topic	Category	Replies	Views	Activity
How to obtain single episode reward?	RLlib	6	1570	March 19, 2024
Add episode reward variance into matrix and tensorboard	RLlib	4	594	February 15, 2022
How rllib train log the reward on tensorboard?	RLlib	1	593	March 25, 2022
Mean reward per agent in MARL	RLlib	11	1277	January 12, 2023
Looking for the tensorboard source code part	RLlib	5	665	May 4, 2022