Cannot figure out why number of processes less than number of available CPUs

xzf0kgb0bqr.cev2RWU · November 29, 2022, 6:00am

- Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I am using one node. I do:

ray start --head --dashboard-host "0.0.0.0"
ray status

And in the usage section, I see:

0.0/28.0 CPU

Great, because I have 28 CPUs.

Now I do:

ray stop

And I see:

Stopped all 7 Ray processes.

As I understand it, Ray should make 1 process for each worker (see the question here for example). I also double-checked this by running a job, checking the dashboard, and confirming that there are at most 7 workers.

i) Why are there not 28 processes?
ii) How can I make Ray use 28 processes (i.e., one process per CPU)?

Stephanie_Wang · November 29, 2022, 9:57pm

The processes you are seeing are long-lived system-level processes that outlive individual Ray jobs. Ray automatically starts workers as needed based on submitted tasks and their resource requirements. Since there are no active jobs, there won’t be any workers.

Once you do submit a job, you should see worker processes start up. The actual number of worker processes is usually about the number of CPUs, but it can be higher or lower depending on worker crashes and the specific workload. There are two ways to submit a job:

- Run a Python script that calls ray.init(). If you can access the head node, this is the simplest.
- Submit a packaged Ray job.
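
For the first option, here is a rough sketch of a script you could run on the head node to see the workers appear. It is only an illustration, not an official recipe: the names `report_pid`, `summarize`, and `main` are made up for this example, and it assumes the cluster was started with `ray start --head` on the same machine so that `address="auto"` can find it. Each task asks for one CPU and returns the PID of the process it ran in, so the number of distinct PIDs approximates the number of worker processes Ray started.

```python
import os
import time
from collections import Counter


def report_pid(delay_s=0.5):
    """Sleep briefly so tasks overlap, then return the PID of this process."""
    time.sleep(delay_s)
    return os.getpid()


def summarize(pids):
    """Return (number of distinct processes, number of tasks handled per PID)."""
    per_pid = Counter(pids)
    return len(per_pid), dict(per_pid)


def main():
    # Imported here so the helpers above can be used without Ray installed.
    import ray

    # Assumption: a cluster is already running on this machine
    # (started with `ray start --head`), so "auto" can attach to it.
    ray.init(address="auto")

    # Wrap the plain function as a Ray task that requests one CPU.
    remote_report = ray.remote(num_cpus=1)(report_pid)

    # Submit more tasks than CPUs so every CPU slot gets used.
    num_cpus = int(ray.cluster_resources().get("CPU", 1))
    pids = ray.get([remote_report.remote() for _ in range(num_cpus * 2)])

    n_workers, _ = summarize(pids)
    print(f"{len(pids)} tasks ran in {n_workers} worker processes "
          f"(cluster reports {num_cpus} CPUs)")
    ray.shutdown()
```

Call `main()` at the end of the script (or from a Python shell) while the cluster is up. While it runs, the dashboard should show worker processes alongside the system processes; the exact count can differ from the CPU count for the reasons mentioned above.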

xzf0kgb0bqr.cev2RWU · November 30, 2022, 10:22pm

Very helpful. Thank you!!
