[RLlib] Effect of num_cpus_for_driver?

Kai_Yun · February 23, 2021, 8:24am

I’m using RLlib DQN to train a model on a custom simulator.
I’ve been playing around with it and can clearly see the benefit of increasing num_workers and num_cpus_per_worker, since I can collect more data much faster. However, I cannot find any benefit in increasing num_cpus_for_driver: in my case there is no noticeable difference between setting it to 1 or 10. I’m using tune.run to run the training, so I assume the setting is in effect, as the documentation suggests: “Number of CPUs to allocate for the trainer. Note: this only takes effect when running in Tune. Otherwise, the trainer runs in the main program.”
What should I expect by increasing num_cpus_for_driver? Thanks in advance!

rliaw · February 24, 2021, 6:54pm

num_cpus_for_driver matters most if your models are on CPU. If you are training models on CPUs, you will probably want to reserve some cores for the model-training part (the driver) so that it can run without thread thrashing.
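To make the CPU reservation concrete, here is a minimal sketch of how the three settings add up for one Tune trial. It assumes the trial requests the driver CPUs plus the CPUs of all rollout workers and ignores any evaluation workers. The numeric values and the two helper functions are hypothetical and only for illustration; only the three config keys come from RLlib.

```python
# Hypothetical values; only the key names are RLlib settings.
config = {
    "num_workers": 4,          # rollout workers collecting samples
    "num_cpus_per_worker": 1,  # CPUs reserved for each rollout worker
    "num_cpus_for_driver": 2,  # CPUs reserved for the trainer (driver)
}


def cpus_requested_per_trial(cfg):
    """CPUs one trial reserves: driver CPUs plus all rollout-worker CPUs.

    Assumption: no evaluation workers or other extra resources are configured.
    """
    return cfg["num_cpus_for_driver"] + cfg["num_workers"] * cfg["num_cpus_per_worker"]


def max_concurrent_trials(cfg, cluster_cpus):
    """How many such trials fit on a cluster with `cluster_cpus` CPUs."""
    return cluster_cpus // cpus_requested_per_trial(cfg)


print(cpus_requested_per_trial(config))    # 2 + 4 * 1 = 6
print(max_concurrent_trials(config, 16))   # 16 // 6 = 2
```

A dict like this (plus an "env" entry) is what would be passed as config to tune.run. Under this accounting, raising num_cpus_for_driver increases what each trial reserves, so it can reduce how many trials run at once even when training speed does not change.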
