[RLlib] Proper number of PPO rollout workers

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

How to determine the best number for PPO rollout workers?

I want to parallelize training with the PPO Trainer class, and I am wondering what a proper value for num_workers would be, and especially whether there can be too many workers.

Should I just max out my machine and set num_workers to the number of CPU cores (24) minus one for the local worker? Or can there be too many workers, in the sense that the local worker receives too many sample batches at once, which actually makes training less efficient? Are there any techniques to "tune" the number of workers, or rules of thumb for what a proper number would be, or does this depend strongly on the respective environment and the specific model and configuration?
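For reference, here is a minimal sketch of the "cores minus one" idea as a config fragment. The keys `num_workers` and `num_envs_per_worker` mirror the Ray 1.13-era PPOTrainer config; the core-counting logic itself is just an assumption about how you might derive the value, not an RLlib recommendation:

```python
import os

# Sketch: derive num_workers from the machine's core count,
# reserving one core for the local (driver) worker.
num_cpus = os.cpu_count() or 2
config = {
    "num_workers": max(1, num_cpus - 1),  # leave one core for the local worker
    "num_envs_per_worker": 1,             # envs can also be batched per worker
}
```

Whether "cores minus one" is actually optimal is exactly what a scaling study would tell you.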

Grateful for any advice!

Here is a helpful rule of thumb: Training APIs — Ray 1.13.0

Here is a similar thread where I asked about what seems to be a performance slowdown with respect to the number of workers (unfortunately I have not had time to explore this further): Num workers speedup?

I suggest you perform a few scaling studies to see what works well for your computer+algorithm+simulation.
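Such a scaling study can be as simple as timing sampling throughput at a few candidate worker counts and picking the best. Here is a minimal harness for that idea; `run_trial` is a hypothetical stand-in for whatever actually trains (e.g. a few `PPOTrainer.train()` iterations at the given worker count) and is assumed to return the number of environment timesteps collected:

```python
import time

def scaling_study(run_trial, worker_counts):
    """Time one training trial per candidate worker count.

    run_trial(num_workers) is a placeholder for the real training
    call; it should return the timesteps it collected. Returns a
    dict mapping worker count -> timesteps per second.
    """
    results = {}
    for n in worker_counts:
        start = time.perf_counter()
        timesteps = run_trial(n)
        elapsed = time.perf_counter() - start
        results[n] = timesteps / elapsed  # sampling throughput
    return results

def best_worker_count(results):
    """Pick the worker count with the highest measured throughput."""
    return max(results, key=results.get)
```

In practice you would run each trial a few times and average, since throughput measurements on a shared machine are noisy.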

I learned that OpenAI Five used 57,600 rollout workers, so running something like 20-50 workers should definitely not be a problem, I guess.
OpenAI Five Dota 2 Paper