[RaySGD] How to best utilise num_cpus_per_worker?

I am using the RaySGD TFTrainer.

3 replicas, num_cpus_per_worker=1, batch_size=128
3 replicas, num_cpus_per_worker=3, batch_size=128

I am not seeing any significant improvement; both runs take the same time. How can we best utilise num_cpus_per_worker?

Another thing I observed:

1 replica, num_cpus_per_worker=1, batch_size=128
1 replica, num_cpus_per_worker=3, batch_size=128 (this takes more time than the first run, and only a single PID is used; is this the right behaviour?)

Env:
Ray v1.0.0
Python 3.8
TF 2.4.1
4-node cluster, 6 cores each

num_cpus_per_worker is just a resource specification for Ray's scheduler – on its own it won't change anything about how the training code runs.

You should make sure num_cpus_per_worker matches the parallelism of your input pipeline, e.g. num_cpus_per_worker == DataLoader(num_workers=...) in PyTorch, or the num_parallel_calls you pass to tf.data transformations.
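The point can be sketched in plain Python, with no Ray dependency (`NUM_CPUS_PER_WORKER`, `preprocess`, and `load_batch` are hypothetical stand-ins, not RaySGD API): the CPUs that num_cpus_per_worker reserves only pay off if the worker's input pipeline actually fans work out across that many parallel workers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the value passed to the trainer as
# num_cpus_per_worker. Ray only *reserves* this many CPUs per worker;
# it is up to the worker's code to actually use them.
NUM_CPUS_PER_WORKER = 3

def preprocess(sample):
    # Stand-in for per-sample preprocessing (decode, augment, ...).
    return sample * 2

def load_batch(samples):
    # Fan preprocessing out across exactly as many workers as Ray
    # reserved. Fewer workers waste the reservation (the extra CPUs
    # sit idle); more would oversubscribe the node.
    with ThreadPoolExecutor(max_workers=NUM_CPUS_PER_WORKER) as pool:
        return list(pool.map(preprocess, samples))

print(load_batch([1, 2, 3, 4]))  # → [2, 4, 6, 8]
```

If the pipeline stays single-threaded (as in the 1-replica observation above, where only one PID was busy), raising num_cpus_per_worker just reserves cores that sit idle, which is why no speed-up is seen.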