Meaning of episode_reward_mean

mannyv · October 18, 2021, 1:26pm

Hi @carlorop,
RLlib collects a number of metrics. One of those is the episode reward. When creating the summary, it computes and stores the mean, min, and max of those metrics.
The metrics summarize all of the values collected during the previous iteration. What defines an iteration varies by algorithm. If you are using PPO, for example, the episode reward metric covers the total reward of every episode that completed while collecting the last train_batch_size steps. If no episode returned done, which is possible for environments with very long episodes, then that metric would be empty.
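
To make that concrete, here is a rough sketch of the bookkeeping in plain Python. This is not RLlib's actual code: `summarize_episode_rewards` is a hypothetical helper, the per-step `rewards`/`dones` lists stand in for one iteration's worth of collected steps, and returning NaN for the "no completed episodes" case is an assumption made for illustration.

```python
import math


def summarize_episode_rewards(step_rewards, dones):
    """Summarize total reward over episodes that finished within one batch.

    Hypothetical helper for illustration only, not part of RLlib.
    """
    completed = []  # total reward of each episode that returned done
    running = 0.0   # reward accumulated in the episode currently in progress
    for reward, done in zip(step_rewards, dones):
        running += reward
        if done:
            completed.append(running)
            running = 0.0
    # Steps of an episode still in progress at the end of the batch are not counted.

    if not completed:
        # Assumption: report NaN when no episode finished during the iteration.
        nan = float("nan")
        return {"episode_reward_mean": nan, "episode_reward_min": nan,
                "episode_reward_max": nan, "episodes_this_iter": 0}

    return {
        "episode_reward_mean": sum(completed) / len(completed),
        "episode_reward_min": min(completed),
        "episode_reward_max": max(completed),
        "episodes_this_iter": len(completed),
    }


# Six collected steps: two episodes finish, a third is still running.
rewards = [1.0, 1.0, 1.0, 2.0, 2.0, 0.5]
dones = [False, False, True, False, True, False]
summary = summarize_episode_rewards(rewards, dones)

# A batch in which no episode finishes (a very long environment).
empty = summarize_episode_rewards([0.1, 0.1, 0.1], [False, False, False])
```

In the first call the two finished episodes have total rewards 3.0 and 4.0, so the mean is 3.5; the trailing 0.5 belongs to an unfinished episode and is left out. In the second call nothing finished, so there is nothing to average.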
