RLlib callbacks to get custom metrics such as observations, rewards, etc. for each episode from SingleAgentEpisode and access them in the trainer

As part of the project I am working on, I need to be able to access intermediate values such as observations, rewards, actions, and action probabilities for each episode, and to see them in the result dictionary returned by the trainer class when we call .train().
Before SingleAgentEpisode in the new API stack, i.e. with EpisodeV2, we were able to access this data like below:

from ray.rllib.algorithms.callbacks import DefaultCallbacks

class MyCallbacks(DefaultCallbacks):
    def on_episode_end(self, *, episode, **kwargs):
        actions = episode.actions.data
        observations = episode.observations.data
        rewards = episode.rewards.data
        # On the old stack, anything put into custom_metrics ends up
        # (summarized) under result["custom_metrics"] after each iteration.
        episode.custom_metrics["observations"] = observations
        episode.custom_metrics["actions"] = actions
        episode.custom_metrics["rewards"] = rewards

But SingleAgentEpisode no longer has the custom_metrics attribute, which means I can't get this data into the result. Does anyone know how to access it in the new API stack?
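For reference, the raw data itself is still readable off the new SingleAgentEpisode through its getter methods; the missing piece is only how to report it into the results. A minimal sketch of what I mean (using the public getters of SingleAgentEpisode):

from ray.rllib.algorithms.callbacks import DefaultCallbacks

class NewStackCallbacks(DefaultCallbacks):
    def on_episode_end(self, *, episode, **kwargs):
        # The data is still accessible on the new-stack episode...
        observations = episode.get_observations()
        actions = episode.get_actions()
        rewards = episode.get_rewards()
        # ...but there is no episode.custom_metrics dict to attach it to.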

I have exactly the same problem. I found a partial solution using the MetricsLogger available in the new callbacks.
The problem now is that on_episode_start, on_episode_step, and on_episode_end, where I need to collect my metrics, receive the MetricsLogger of the EnvRunner.

I need to do my calculations and preparations for the wandb upload in on_evaluate_end.
But in on_evaluate_end, it is the MetricsLogger object of the Algorithm class that is passed in.
If this held the objects I logged in the on_episode_step callback, everything would be fine. But the metrics seem to get reduced even when reduce=None is set while logging them in on_episode_step.
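To make this concrete, here is a minimal sketch of the logging side, assuming the new-stack callback signature that passes metrics_logger (the key name "step_rewards" is just an example):

from ray.rllib.algorithms.callbacks import DefaultCallbacks

class CollectMetricsCallback(DefaultCallbacks):
    def on_episode_step(self, *, episode, metrics_logger, **kwargs):
        # reduce=None asks the logger to keep the raw values as a list
        # instead of averaging them; clear_on_reduce=True empties that
        # list after each reduction cycle so values are not re-reported.
        metrics_logger.log_value(
            "step_rewards",
            episode.get_rewards()[-1],  # reward of the most recent step
            reduce=None,
            clear_on_reduce=True,
        )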

So maybe this gives you a further clue or somebody with more knowledge can step in and explain how it is supposed to work.


Solved the problem:

You can use the MetricsLogger in any of the callbacks:

metrics_logger.log_value(("myData", infos[0].get("evalEnvID", 0) + 1), infos, reduce=None, clear_on_reduce=True)
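A note on the key: as far as I can tell, passing a tuple such as ("myData", <id>) makes the MetricsLogger store the value under a nested key, so each evaluation env gets its own sub-entry under "myData". (evalEnvID here is a custom field our env puts into its info dict, not an RLlib builtin.)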

Then you can get back your data in the on_evaluate_end callback with

data = evaluation_metrics["env_runners"]["myData"]
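Put together, a hedged sketch of the retrieval side, assuming the new-stack on_evaluate_end signature (argument names as in recent Ray versions; adapt to yours):

from ray.rllib.algorithms.callbacks import DefaultCallbacks

class EvalUploadCallback(DefaultCallbacks):
    def on_evaluate_end(self, *, algorithm, metrics_logger, evaluation_metrics, **kwargs):
        # Values logged on the EnvRunners are compiled into the
        # evaluation results under the "env_runners" key.
        data = evaluation_metrics["env_runners"]["myData"]
        # ... prepare the data and upload it, e.g. to wandb.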

I think it will be reduced unless you specify reduce=None during logging.
