Meaning of timers in RLlib PPO

Hi, I have trained a model using PPO, and when I look at the per-iteration results (the ones saved in ~/ray_results that you can visualize with TensorBoard) there are some metrics I don't know exactly what they refer to. Is there any place where these metric values are explained? Specifically, I'm interested in knowing what each timer represents:

  • timers/sample_time_ms
  • timers/sample_throughput
  • timers/load_time_ms
  • timers/load_throughput
  • timers/learn_time_ms
  • timers/learn_throughput
  • timers/update_time_ms

Thanks in advance


Hi @javigm98 ,

I do not think the metrics are documented, but I had the same question, so hopefully I can help find the answer.

load_time_ms, learn_time_ms and update_time_ms are recorded in the __call__(...) method of ray.rllib.execution.train_ops.TrainTFMultiGPU, using the timer objects defined there.

If I remember correctly, learn_time_ms is the time to compute the gradients and perform one gradient descent step; load_time_ms is the time to load the samples onto the device that will compute the gradients (the GPU(s)); update_time_ms is the time to send the new network weights to each worker before starting the next iteration.
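To make the mapping concrete, here is a minimal sketch of where each timer would sit in a training iteration. All four operation functions are hypothetical stand-ins, not RLlib's real internals; only the phase names correspond to the metrics above:

```python
import time

def timed(fn):
    """Run fn and return (result, elapsed milliseconds). Illustration only."""
    t0 = time.perf_counter()
    out = fn()
    return out, (time.perf_counter() - t0) * 1000.0

# Hypothetical stand-ins for the real RLlib operations:
def collect_samples():          # workers roll out the policy in the envs
    return list(range(128))

def load_to_device(batch):      # copy the sample batch to the GPU(s)
    return batch

def sgd_step(batch):            # compute gradients + one optimizer step
    return sum(batch)

def broadcast_weights():        # sync new weights to each worker
    return None

# One iteration, with each phase timed like the corresponding metric:
batch, sample_time_ms = timed(collect_samples)            # timers/sample_time_ms
_, load_time_ms = timed(lambda: load_to_device(batch))    # timers/load_time_ms
_, learn_time_ms = timed(lambda: sgd_step(batch))         # timers/learn_time_ms
_, update_time_ms = timed(broadcast_weights)              # timers/update_time_ms
```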

I don’t remember where sample_time_ms is recorded, but it is basically the time for the workers to collect enough samples for one iteration.

The _throughput metrics are computed by the timers (see ray.utils.timer._Timer.mean_throughput) as the number of steps loaded or trained per second during the respective operations.


Hi @thomaslecat, and thank you so much for your answer. It really clarified things for me!!


Kind of an iffy question, but what are some good values (in terms of orders of magnitude) for these timers? For example, for a fully connected [64, 64] NN, my learn_time_ms is something like 2000 ms, which feels slow to me. I wasn’t using a GPU, but the network is quite small, so I’m not sure if my intuition is valid…