Learning curves

Constantine_Bardis · March 9, 2022, 6:41pm

How severely does this issue affect your experience of using Ray?

None: Just asking a question out of curiosity
[x] Low: It annoys or frustrates me for a moment.
Medium: It contributes to significant difficulty in completing my task, but I can work around it and get it resolved.
High: It blocks me to complete my task.

Hello,

I would like to ask: how can I get the learning curve of the mean reward per episode for each trial in a multi-agent hyperparameter search, like in the picture I attached (taken from the MADDPG paper)?

I only get scalar quantities at the end of each trial, while what I want is the mean reward across episodes so that I can plot it, and I am not sure how to approach that. Am I missing some obvious functionality? I am relatively new to Ray and multi-agent RL, so excuse my questions if they are too noob-ish.

[Image: lr curve]

matthewdeng · March 9, 2022, 11:08pm

Hi @Constantine_Bardis,

If you are using RLlib for this, then this information will be logged for you as a TensorBoard file!

During your Tune run, the logdir will be printed to STDOUT:

Result logdir: <path>

You can then visualize these metrics (along with others) by opening them with TensorBoard:

tensorboard --logdir <path>
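If you would rather build the plot yourself than read it off TensorBoard, the same per-iteration values are also written to a `progress.csv` file inside each trial's directory under the logdir. Below is a minimal sketch that collects one curve per trial. It assumes the metric column is named `episode_reward_mean` and the x-axis column `training_iteration`; check the header of your own `progress.csv`, since column names can differ between Ray versions. The helper name `load_learning_curves` is made up for this example.

```python
import csv
from pathlib import Path


def load_learning_curves(logdir, metric="episode_reward_mean",
                         x_key="training_iteration"):
    """Return {trial_dir_name: [(x, y), ...]} read from each trial's progress.csv.

    Assumption: every trial directory directly under `logdir` contains a
    progress.csv with one row per training iteration.
    """
    curves = {}
    for progress_file in sorted(Path(logdir).glob("*/progress.csv")):
        points = []
        with progress_file.open(newline="") as f:
            for row in csv.DictReader(f):
                try:
                    x = float(row[x_key])
                    y = float(row[metric])
                except (KeyError, TypeError, ValueError):
                    # Column missing or cell empty: skip this row.
                    continue
                if y != y:
                    # NaN, e.g. no episode had finished yet in this iteration.
                    continue
                points.append((x, y))
        curves[progress_file.parent.name] = points
    return curves
```

Each value in the returned dict is a list of `(iteration, mean reward)` pairs, so you can pass the two columns to any plotting library and draw one line per trial.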

Constantine_Bardis · March 12, 2022, 11:05am

Hello,

Yes, I managed to view them in TensorBoard once I downloaded the log files locally, since I presume there is no easy support for this when running in Kaggle notebooks. Many thanks for the prompt reply!

I would also like to ask: among the columns that are returned, some are called “hist_stats/…”. What are these? I think I have them for each agent’s accumulated rewards.

matthewdeng · March 13, 2022, 1:18am

Awesome! hist_stats contains some RLlib episode metrics. You can see how hist_stats is computed here.
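As a rough illustration: as far as I can tell, the hist_stats entries are lists of raw per-episode values (for example episode rewards) rather than single numbers, so you can recompute your own statistics from them. The sketch below assumes that a hist_stats cell read back from a CSV file is a stringified Python list such as `"[1.0, 2.5]"`; that is an assumption about the log format, so inspect your own file first. The helper names are made up for this example.

```python
import ast
from statistics import mean


def parse_hist_stat(cell):
    """Parse a stringified list such as "[1.0, 2.5]" into a list of floats.

    Assumption: the cell is a plain Python list literal of numbers.
    An empty cell is treated as "no episodes recorded".
    """
    values = ast.literal_eval(cell) if cell else []
    return [float(v) for v in values]


def per_iteration_mean(cells):
    """Mean of each iteration's episode list, or None where the list is empty."""
    means = []
    for cell in cells:
        values = parse_hist_stat(cell)
        means.append(mean(values) if values else None)
    return means
```

For example, `per_iteration_mean(["[1.0, 3.0]", "[]"])` returns `[2.0, None]`.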

Constantine_Bardis · March 15, 2022, 8:45am

Many thanks, I was actually curious as to how the hist_stats were computed!

One final question, a little off-topic: do you have any tips on how to make QMIX and agent grouping work for PettingZoo? I had posted the code I wrote while trying to make it work, but I never got an answer and haven’t figured it out yet…

matthewdeng · March 15, 2022, 4:33pm

Unfortunately I’m not familiar enough with RLlib to provide any substantial advice here. Can you post your question in the RLlib - Ray topic?
