Accessing RLlib evaluation in tune.Analysis

When running multiple RL experiments with evaluation during training, RLlib reports the evaluation metrics to TensorBoard as “ray/tune/evaluation/…”.
How can those metrics be accessed programmatically for analysis after training?
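
For reference, the training setup looks roughly like this (a minimal sketch; the algorithm, environment, and config values are placeholders, not my exact setup):

import ray.tune as tune

tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",
        "evaluation_interval": 5,       # evaluate every 5 train() iterations
        "evaluation_num_episodes": 10,  # episodes per evaluation round
    },
    stop={"training_iteration": 20},
    local_dir="/home/username/ray_results/my_experiments",
)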

I looked into tune.Analysis to easily get statistics about multiple experiments; it works great, but it has everything except the evaluation data :confused:

import ray.tune as tune

EXPERIMENT_FOLDER = "/home/username/ray_results/my_experiments"
analysis = tune.Analysis(
    EXPERIMENT_FOLDER, default_metric="episode_reward_mean", default_mode="max"
)
df = analysis.dataframe()  # one row per trial, built from each trial's last result
for c in df.columns:
    print(c)

This will print:

episode_reward_max
episode_reward_min
episode_reward_mean
episode_len_mean
episodes_this_iter
... 
custom_metrics/... 
...
config/... 

I am looking for a similar solution to get the evaluation data. Thanks!

Using Ray version 1.2.0.

@MaximeBouton actually, I’m not sure. Could you ask this under Ray Tune?

The evaluation data is returned under the “evaluation” top-level key of the metrics dict that an RLlib Trainer.train() call returns.
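
Concretely, something along these lines should show the evaluation stats when calling train() by hand (a minimal sketch against the Ray 1.2-era agents API; the algorithm and env are arbitrary):

import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
# With evaluation_interval set, the result dict of each matching iteration
# carries an "evaluation" sub-dict with the usual episode stats.
trainer = PPOTrainer(config={"env": "CartPole-v0", "evaluation_interval": 1})
result = trainer.train()
print(result["evaluation"]["episode_reward_mean"])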

@kai @amogkam @rliaw ?

Thank you for the answer.
From my own investigation, I found that Tune pulls the data from result.json, and the “evaluation” key is not reported in this file.
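
To double-check what actually lands in result.json, you can read it directly; Tune writes one JSON dict per training iteration, one per line (the trial directory below is a hypothetical name):

import json
import os

trial_dir = "/home/username/ray_results/my_experiments/some_trial"
with open(os.path.join(trial_dir, "result.json")) as f:
    first_result = json.loads(f.readline())  # dict for the first iteration
print(sorted(first_result.keys()))  # no "evaluation" key shows up here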

(I changed the topic to Ray Tune.)

Could this have something to do with the fact that the dict returned by Trainer.train() does not initially contain the key, since evaluation only runs every n iterations?
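
If the key does make it into some rows of result.json (only the iterations where evaluation actually ran), a guard like this would collect just the evaluation metrics (again a sketch; the path is hypothetical):

import json

eval_rewards = []
with open("/home/username/ray_results/my_experiments/some_trial/result.json") as f:
    for line in f:
        result = json.loads(line)
        if "evaluation" in result:  # absent on non-evaluation iterations
            eval_rewards.append(result["evaluation"]["episode_reward_mean"])
print(eval_rewards)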

I would also be very interested in a solution to this. Does the problem still exist?

@MaximeBouton @LukasNothhelfer

Take a look at this message and the one below it for an explanation of the issue and how to fix it.