How can I get evaluation metrics from ExperimentAnalysis?

Hi,
I can get episode_reward_[min/mean/max] from ExperimentAnalysis.trial_dataframes. However I cannot get evaluation/episode_reward_[min/mean/max] from ExperimentAnalysis.trial_dataframes when I enable evaluation. I do see the evaluation/episode_reward_[min/mean/max] in tensorboard. May I know if I should do extra configuration to make it work?
Thanks,
James

Below is what I get from ExperimentAnalysis.trial_dataframes. Unfortunately, there are no evaluation metrics in it.

Index(['episode_reward_max', 'episode_reward_min', 'episode_reward_mean',
       'episode_len_mean', 'episodes_this_iter', 'num_healthy_workers',
       'timesteps_total', 'timesteps_this_iter', 'agent_timesteps_total',
       'done', 'episodes_total', 'training_iteration', 'trial_id',
       'experiment_id', 'date', 'timestamp', 'time_this_iter_s',
       'time_total_s', 'pid', 'hostname', 'node_ip', 'time_since_restore',
       'timesteps_since_restore', 'iterations_since_restore',
       'hist_stats/episode_reward', 'hist_stats/episode_lengths',
       'sampler_perf/mean_raw_obs_processing_ms',
       'sampler_perf/mean_inference_ms',
       'sampler_perf/mean_action_processing_ms',
       'sampler_perf/mean_env_wait_ms', 'sampler_perf/mean_env_render_ms',
       'timers/sample_time_ms', 'timers/sample_throughput',
       'timers/load_time_ms', 'timers/load_throughput', 'timers/learn_time_ms',
       'timers/learn_throughput', 'info/num_steps_sampled',
       'info/num_agent_steps_sampled', 'info/num_steps_trained',
       'info/num_steps_trained_this_iter', 'info/num_agent_steps_trained',
       'perf/cpu_util_percent', 'perf/ram_util_percent',
       'perf/gpu_util_percent0', 'perf/vram_util_percent0',
       'info/learner/default_policy/learner_stats/cumulative_regret',
       'info/learner/default_policy/learner_stats/update_latency', 'trial'],
      dtype='object')
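
For completeness, this is roughly how I am reading the columns (the PPO/CartPole config here is just a placeholder for my actual setup):

```python
from ray import tune

# placeholder run; my real config enables evaluation via "evaluation_interval"
analysis = tune.run(
    "PPO",
    config={"env": "CartPole-v1", "evaluation_interval": 5},
    stop={"training_iteration": 20},
)

# trial_dataframes maps each trial's logdir to a pandas DataFrame of its results
for logdir, df in analysis.trial_dataframes.items():
    print(df.columns)  # no "evaluation/..." columns show up here
```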

@gjoliver @sven1977 any ideas here?

Right, evaluation results are only computed once every n training iterations. So depending on when your Tune session stops, that copy of the result may not contain the evaluation result dict.

If you simply want the latest available eval results whenever you stop, you can set the flag always_attach_evaluation_results=True. It is a feature we introduced recently that buffers the latest copy of the eval results and attaches it to every single result dictionary before they are passed up to Tune.
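
A minimal sketch of what that could look like (exact config keys can differ slightly between RLlib versions, so treat this as an illustration rather than a copy-paste recipe):

```python
from ray import tune

analysis = tune.run(
    "PPO",
    config={
        "env": "CartPole-v1",
        # run an evaluation phase every 5 training iterations
        "evaluation_interval": 5,
        # buffer the latest eval results and attach them to every
        # result dict that gets reported to Tune
        "always_attach_evaluation_results": True,
    },
    stop={"training_iteration": 20},
)

# the evaluation metrics should now show up as "evaluation/..." columns
for logdir, df in analysis.trial_dataframes.items():
    print([c for c in df.columns if c.startswith("evaluation/")])
```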

We will update our documentation with this trick. Please also use the latest release of RLlib to make sure you get this feature.
