How to obtain single episode reward?

From here, I learned that the “episode_reward_mean” metric indicates the average of rewards collected in all previous episodes.

How can I obtain a single episode reward? Should I make a custom metric for it in the callback class?

If you look at your progress.csv files in ~/ray_results, you can see that the individual episode rewards are already saved under “hist_stats/episode_reward”. If you have the train_results (like ray/action_masking.py at a7d552ca2541376b87a40bc6b2189bab5a5c6c5a · ray-project/ray · GitHub) as a variable, you can access them with train_results[“hist_stats”][“episode_reward”].
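A minimal sketch of that access pattern. The nested dict below is mocked so the snippet runs without Ray; in real use, train_results is whatever trainer.train() returns, with the same "hist_stats" layout:

```python
# Mocked result dict standing in for the return value of trainer.train().
# Only the keys we read here are filled in; a real result has many more.
train_results = {
    "episode_reward_mean": 5.0,
    "hist_stats": {
        "episode_reward": [3.0, 5.0, 7.0],  # one entry per completed episode
        "episode_lengths": [10, 12, 9],
    },
}

# Per-episode rewards, not averaged:
episode_rewards = train_results["hist_stats"]["episode_reward"]
latest_episode_reward = episode_rewards[-1]  # reward of the most recent episode
```

Note that "hist_stats" is a rolling window over recent episodes, so it is a short history rather than the full run.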


@Roller44 Also take a look at custom_metrics_and_callbacks.py to see how to access the metrics and create your own to track in TensorBoard.

It works! Thanks a lot!

A follow-up question: is there a way to access custom metrics without them being averaged over previous episodes?

In other words, is there a way to access xxx itself instead of xxx_min, xxx_mean, and xxx_max?

That might not work for me, because I can only access the average of the custom metric values collected over all previous episodes.

Is there a way to access custom_metric instead of custom_metric_min, custom_metric_mean, and custom_metric_max?

It is a little unclear where you need to access the episode reward. Within the Episode object, the individual rewards collected over the timesteps should be available in Episode.hist_data.

If you need to track these values, subclassing DefaultCallbacks gives you access to the Episode instance in on_episode_end(). This lets you create your own metrics from the single-episode rewards, both for monitoring in TensorBoard and for tuning (and also for post-hoc analysis).
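A sketch of that hook pattern. In real code you would subclass DefaultCallbacks from ray.rllib (the exact import path varies by Ray version); here a plain class and a fake Episode stand in so the logic runs without Ray installed:

```python
# Stand-in for a DefaultCallbacks subclass (no Ray import needed to run this).
class SingleEpisodeRewardCallbacks:
    def on_episode_end(self, *, episode, **kwargs):
        # episode.total_reward is the summed reward of the episode that just
        # ended; writing it into episode.custom_metrics makes RLlib report it
        # (as min/mean/max per training iteration) to TensorBoard.
        episode.custom_metrics["single_episode_reward"] = episode.total_reward


# Tiny stand-in for the Episode object RLlib passes to the hook:
class FakeEpisode:
    def __init__(self, total_reward):
        self.total_reward = total_reward
        self.custom_metrics = {}

ep = FakeEpisode(total_reward=4.5)
SingleEpisodeRewardCallbacks().on_episode_end(episode=ep)
```

In real RLlib code you would pass the callbacks class via the trainer config (e.g. config["callbacks"] = SingleEpisodeRewardCallbacks) rather than calling the hook yourself.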

If you need to access the values in a custom loop where you call trainer.train(), then reading them from the return value of that call, as shown by @lucasalavapena, is the best way to go.
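A sketch of such a custom loop. FakeTrainer below is a stand-in so the loop runs without Ray; a real RLlib trainer's train() returns a result dict with the same "hist_stats" keys:

```python
# Stand-in trainer whose train() returns a minimal RLlib-style result dict.
class FakeTrainer:
    def __init__(self):
        self._iter = 0

    def train(self):
        self._iter += 1
        # A real result dict has many more keys; only the one we read is mocked.
        return {
            "hist_stats": {
                "episode_reward": [1.0 * self._iter, 2.0 * self._iter],
            }
        }

trainer = FakeTrainer()
all_episode_rewards = []
for _ in range(3):
    result = trainer.train()
    # "hist_stats/episode_reward" is a window over recent episodes, so with a
    # real trainer consecutive iterations can repeat entries; deduplicate if
    # you need an exact per-episode record.
    all_episode_rewards.extend(result["hist_stats"]["episode_reward"])
```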
