Does setting rollout_workers>1 speed up training in normal PPO?

pkgunboat · April 26, 2023, 2:10am

Hello, I was wondering whether the number of rollout workers is linearly related to the number of timesteps sampled per iteration. When I trained with num_rollout_workers=2 and with num_rollout_workers=16, the number of samples per iteration barely changed (more workers did not lead to noticeably more samples). There was also no significant difference in the training time per iteration. What causes this? Or have I misunderstood what a rollout worker does? (As the image below shows, with num_rollout_workers set to 2 or 16, episode_this_iter stays between 150 and 200, and each iteration takes roughly 250s-300s of training time.)

Moreover, during training, the same training message is printed multiple times (as shown below).

Is this normal?

arturn · April 27, 2023, 4:45am

With PPO, training alternates between two phases: sampling and learning.
Rollout workers only perform rollouts of your policy, so adding workers can only speed up the sampling phase.
Roughly speaking, sampling speed should scale linearly with the number of workers. Other factors influence this as well, and you obviously can’t scale it up linearly indefinitely.
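
To make the two-phase picture concrete, here is a minimal back-of-the-envelope sketch. It is not RLlib code: the function name and all numbers (throughput per worker, learning time) are hypothetical, and it assumes perfectly linear sampling speedup, which real setups only approximate.

```python
def iteration_time(train_batch_size, num_workers, steps_per_sec_per_worker, learn_seconds):
    """Idealized wall-clock time of one PPO iteration.

    Sampling is split evenly across workers (assumed perfectly parallel);
    learning takes a fixed time that does not depend on the worker count.
    """
    sample_seconds = train_batch_size / (num_workers * steps_per_sec_per_worker)
    return sample_seconds + learn_seconds


# Hypothetical numbers: 4000 steps per batch, 100 steps/s per worker, 250 s of learning.
for workers in (2, 16):
    total = iteration_time(4000, workers, 100.0, 250.0)
    print(f"{workers} workers: {total:.1f}s per iteration")
# prints:
# 2 workers: 270.0s per iteration
# 16 workers: 252.5s per iteration
```

Under these assumed numbers the learning phase dominates, so going from 2 to 16 workers makes sampling 8x faster but barely changes the total time per iteration.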

PPO collects samples and waits until it has a certain number of them (the train batch size) before it moves on to training on them. Increasing the number of rollout workers therefore usually does not increase the number of samples per iteration; at most there is a small random overshoot, after which PPO moves on to training. Since PPO is an on-policy algorithm, it makes no sense to collect very large amounts of samples if you are only going to train on your “small” train batch size anyway.
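
The "collect until the batch is full" behaviour can be illustrated with a toy model. This is a sketch under stated assumptions, not RLlib's actual sampling code: it assumes every worker returns a fixed-size fragment per round and that rounds repeat until the target is reached. The function name and the fragment length are hypothetical.

```python
import math


def steps_collected(train_batch_size, num_workers, fragment_length):
    """Toy model: steps gathered in one iteration when all workers
    return `fragment_length` steps per round until the target is met."""
    steps_per_round = num_workers * fragment_length
    rounds = math.ceil(train_batch_size / steps_per_round)
    return rounds * steps_per_round


# Hypothetical target of 4000 steps with fragments of 100 steps each.
for workers in (2, 16):
    print(workers, "workers ->", steps_collected(4000, workers, 100), "steps")
# prints:
# 2 workers -> 4000 steps
# 16 workers -> 4800 steps
```

In this model the number of steps per iteration is pinned to the target batch size; more workers only add an overshoot of less than one round, which matches the observation that the sample count barely changes between 2 and 16 workers.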

pkgunboat · May 5, 2023, 7:25am

Thanks for your reply, it deepened my understanding of the sampling mechanism.
