PPO configuration parameters: num_rollout_workers & train_batch_size

MRMarlies · October 23, 2023, 8:09am

Hello!

I’m using RLlib (Ray 2.6.3), specifically PPO, for my task. I have a question about the PPO configuration that is still not clear to me.

Is there a connection between these two training variables: “num_rollout_workers” and “train_batch_size”? For example, when I set “num_rollout_workers” to 2, do I have to multiply “train_batch_size” by the number of rollout workers in the configuration?

Many thanks for your support in advance!

Greetings,
MRMarlies

Hasnain_Fareed · November 2, 2023, 11:23am

Yes, there is a relationship between num_rollout_workers and train_batch_size in the configuration of PPO in RLlib.

The num_rollout_workers parameter specifies the number of workers that are used for environment sampling. Each of these workers collects samples from the environment in parallel, which can significantly speed up the data collection process.

On the other hand, train_batch_size is the number of samples collected by all rollout workers combined that the algorithm will use for each training iteration.

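So, if you have num_rollout_workers=2, you don’t have to multiply train_batch_size by 2. Because train_batch_size is already the total across all workers, adding workers only changes how that total is split between them. What you should ensure is that train_batch_size is at least as large as the number of samples all workers collect together in one sampling round.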
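To make the "no need to multiply" point concrete, here is a minimal sketch in plain Python (it does not import RLlib). The helper name per_env_fragment is hypothetical, and the even split it computes is an assumption about how a fixed total is shared out, not RLlib's internal code:

```python
import math

def per_env_fragment(train_batch_size, num_rollout_workers, num_envs_per_worker=1):
    """Steps each environment copy must contribute so that all workers
    together reach train_batch_size in one sampling round (rounded up)."""
    # Assumption: with num_rollout_workers=0, a single local worker samples.
    sampling_workers = max(num_rollout_workers, 1)
    return math.ceil(train_batch_size / (sampling_workers * num_envs_per_worker))

# Same total batch, different worker counts: the total stays at 4000,
# only the share per worker shrinks.
for workers in (1, 2, 4):
    print(workers, per_env_fragment(4000, workers))
```

The total handed to the learner is unchanged in every case; more workers only means each one has less to collect.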

In other words, train_batch_size should be greater than or equal to num_rollout_workers * rollout_fragment_length * num_envs_per_worker. This is because each worker collects rollout_fragment_length * num_envs_per_worker samples before sending them to the learner.
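The inequality above can be checked with a few lines of plain Python. This is only a sketch: sampling_plan is a hypothetical helper, and it assumes that workers always sample in whole rounds until train_batch_size is reached, which may not match every RLlib version exactly:

```python
def sampling_plan(train_batch_size, num_rollout_workers,
                  rollout_fragment_length, num_envs_per_worker=1):
    """Return how many samples one sampling round yields and how many
    rounds are needed to reach at least train_batch_size."""
    # Assumption: with num_rollout_workers=0, a single local worker samples.
    workers = max(num_rollout_workers, 1)
    samples_per_round = workers * rollout_fragment_length * num_envs_per_worker
    # Ceiling division: rounds needed to collect at least train_batch_size.
    rounds = -(-train_batch_size // samples_per_round)
    collected = rounds * samples_per_round
    return {
        "samples_per_round": samples_per_round,
        "rounds": rounds,
        "collected": collected,
        "overshoot": collected - train_batch_size,
    }

# One worker, fragment length 200, batch 200: exactly one round, no overshoot.
print(sampling_plan(200, 1, 200))
# Two workers, fragment length 200, batch 1000: three rounds of 400 samples.
print(sampling_plan(1000, 2, 200))
```

In the second call the workers would gather 1200 samples for a requested batch of 1000, which is why a train_batch_size that is a whole multiple of the per-round product is the easier choice.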

Here’s an example from a forum post:

num_gpus = 0
num_gpus_per_worker = 0
num_cpus_for_local_worker = 1
num_cpus_per_worker = 1
num_rollout_workers = 1
rollout_fragment_length = 200
train_batch_size = 200  # here equal to rollout_fragment_length * num_rollout_workers * num_envs_per_worker
sgd_minibatch_size = 32

In this example, train_batch_size is set to 200, which equals rollout_fragment_length * num_rollout_workers * num_envs_per_worker = 200 * 1 * 1, with num_envs_per_worker left at its RLlib default of 1.

Remember, train_batch_size is a hyperparameter that you can tune for your specific problem and computational resources. It doesn’t have to be exactly equal to num_rollout_workers * rollout_fragment_length * num_envs_per_worker, but it should be at least that large, and choosing a whole multiple of that product keeps the number of collected samples in line with the value you configured.
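For the two-worker case from the question, the settings could be written with the PPOConfig builder roughly as follows. This is a sketch assuming the Ray 2.6-style API (rollouts() and training()); "CartPole-v1" is only a placeholder environment that does not come from this thread, and the numbers are illustrative:

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # placeholder environment, not from the thread
    .rollouts(
        num_rollout_workers=2,
        num_envs_per_worker=1,
        rollout_fragment_length=200,
    )
    .training(
        # Total across both workers: 2 workers * 200 steps * 1 env = 400.
        train_batch_size=400,
        sgd_minibatch_size=32,
    )
)

# The config only stores the values here; nothing is sampled or trained yet.
print(config.train_batch_size)
```

If you would rather not keep these numbers in sync by hand, setting rollout_fragment_length="auto" should, as far as I know, let RLlib derive the fragment length from train_batch_size and the worker settings.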
