How to set num_cpus correctly?

Hi, I’m running Ray directly on a compute node with 128 cores and have run into problems setting num_cpus.

Example

import ray
import time

@ray.remote(num_cpus=50)
def test_ray(num):
    print(num)
    time.sleep(5)


if __name__ == '__main__':
    futures = [test_ray.remote(i) for i in range(200)]
    ray.get(futures)

Computer info

$ cat /proc/meminfo |grep MemTotal
MemTotal:       528083404 kB

$ nproc --all
128

$ uname -a
Linux node1 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Problems

When I don’t set num_cpus, it works well:

(test_ray pid=139395) 107
(test_ray pid=139397) 106
(test_ray pid=139405) 105
(test_ray pid=139398) 104
(test_ray pid=139399) 103
(test_ray pid=139404) 102
(test_ray pid=139409) 101
(test_ray pid=139402) 99
(test_ray pid=139400) 100
(test_ray pid=139401) 98
(test_ray pid=139406) 97
(test_ray pid=139407) 96
(test_ray pid=139408) 95
(test_ray pid=139411) 93
(test_ray pid=139410) 94
(test_ray pid=139413) 92
(test_ray pid=139412) 91
(test_ray pid=139415) 90
(test_ray pid=139422) 89
....
...

When I set num_cpus=50, only two tasks run at a time:

(test_ray pid=142874) 0
(test_ray pid=142872) 1
(test_ray pid=142874) 3
(test_ray pid=142872) 2
....

When I increase it to 100, the tasks run one at a time.
Is there a rule for choosing num_cpus?

Hi @zxdawn
Unless your task requires multiple CPUs (for example, because it runs multiple threads), you should probably set num_cpus=1.

Hi @Chen_Shen, yes, I need it to run in multiple threads.

Ray schedules tasks based on the available resources and each task's resource requirement. If you want to increase parallelism (the number of concurrent tasks), you need to set num_cpus to a lower number.

For example, if you have 128 available cores and set num_cpus=50 for each task, the maximum parallelism you can expect is 128 // 50 = 2. Likewise, num_cpus=100 gives 128 // 100 = 1, which is why the tasks ran one at a time.
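
The same arithmetic can be sketched in plain Python. This is only an estimate under stated assumptions: the helper names `max_parallel_tasks` and `estimated_wall_time` are made up for illustration, num_cpus is taken to be a whole number, CPUs are the only resource the tasks request, and scheduling overhead is ignored.

```python
import math


def max_parallel_tasks(total_cpus, num_cpus_per_task):
    """Number of tasks that fit on the node at once (hypothetical helper)."""
    slots = total_cpus // num_cpus_per_task
    if slots == 0:
        # A task that asks for more CPUs than the node has cannot be scheduled.
        raise ValueError("num_cpus exceeds the CPUs available")
    return slots


def estimated_wall_time(n_tasks, task_seconds, total_cpus, num_cpus_per_task):
    """Rough run time: tasks execute in waves of max_parallel_tasks each."""
    slots = max_parallel_tasks(total_cpus, num_cpus_per_task)
    waves = math.ceil(n_tasks / slots)
    return waves * task_seconds


# The numbers from this thread: 128 cores, 200 tasks sleeping 5 seconds each.
for num_cpus in (1, 50, 100):
    slots = max_parallel_tasks(128, num_cpus)
    seconds = estimated_wall_time(200, 5, 128, num_cpus)
    print(f"num_cpus={num_cpus}: {slots} concurrent tasks, about {seconds} s")
```

Under these assumptions the 200 tasks need about 10 s with num_cpus=1, about 500 s with num_cpus=50, and about 1000 s with num_cpus=100.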


Ha, got it! I misunderstood num_cpus. I thought it was the total number of CPUs used for parallelism, with each CPU handling one job.