Does setting rollout_workers>1 speed up training in normal PPO?

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity
  • Low: It annoys or frustrates me for a moment.
  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.
  • High: It blocks me to complete my task.

Hello everyone, I was wondering whether the number of rollout workers is linearly related to the number of timesteps sampled per iteration. When I trained with num_rollout_workers=2 and with num_rollout_workers=16, the number of samples per iteration barely changed (specifically, adding workers did not lead to a corresponding increase in the number of samples). There was also no significant difference in the training time per iteration. What causes this? Or have I misunderstood what a rollout worker does? (The image is shown below. With num_rollout_workers set to 2 or 16, episode_this_iter is between 150 and 200, and each iteration takes roughly 250s-300s of training time.)

Moreover, during training, the same training message is printed multiple times (as shown below).

Is this normal?

With PPO, training alternates between two phases: sampling and learning.
Adding rollout workers can only speed up sampling, because rollout workers are what perform the rollouts of your policy.
So sampling speed should scale roughly linearly with the number of workers, although other factors influence it and you can’t scale it up linearly indefinitely.
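
To make the two-phase point concrete, here is a rough back-of-the-envelope sketch. It is only an idealized model, not anything RLlib computes: the function `iteration_time` and all the numbers are made up for illustration, and it assumes sampling parallelizes perfectly while learning time stays fixed.

```python
def iteration_time(sampling_time_one_worker, learning_time, num_workers):
    """Seconds per iteration if only the sampling phase is split across workers."""
    # Idealized assumption: sampling divides evenly, learning does not shrink.
    return sampling_time_one_worker / num_workers + learning_time


# Hypothetical numbers: 80 s to sample one batch with a single worker,
# 240 s to learn on that batch.
for n in (2, 16):
    print(n, iteration_time(80.0, 240.0, n))
```

With these made-up numbers, going from 2 to 16 workers only takes the iteration from 280 s to 245 s, because the learning phase dominates.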

PPO collects samples and waits until it has gathered a certain number of them (the train batch size) before it moves on to training on them. Adding rollout workers therefore usually does not increase the number of samples per iteration; at most there is a small, somewhat random overshoot, after which PPO moves on to training. Since PPO is an on-policy algorithm, it makes no sense to collect very large amounts of samples if you are only going to train on your “small” train batch size anyway.
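
A small sketch of why the sample count barely moves. This is a simplified model and an assumption on my part, not RLlib's actual collection code: it pretends every worker returns one fragment per round and that collection stops after the first round that reaches the target. The parameter names mirror the `train_batch_size` and `rollout_fragment_length` settings, but `samples_per_iteration` itself is a hypothetical helper and the values are illustrative.

```python
def samples_per_iteration(train_batch_size, num_workers, rollout_fragment_length):
    """Timesteps collected before learning starts, under a round-based model."""
    # Assumption: each round, every worker contributes exactly one fragment.
    per_round = num_workers * rollout_fragment_length
    # Stop after the first round that reaches the target (ceiling division).
    rounds = (train_batch_size + per_round - 1) // per_round
    return rounds * per_round


# Illustrative values only: target of 4000 steps, fragments of 30 steps.
for n in (2, 16):
    print(n, samples_per_iteration(4000, n, 30))
```

In this model, 2 workers collect 4020 steps and 16 workers collect 4320 steps. The target stays at 4000 either way, so more workers only change how fast the target is reached and how large the overshoot is.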

Thanks for your reply, it deepened my understanding of the sampling mechanism.