Determine number of workers used

Low: It annoys or frustrates me for a moment.

Question

I am using a ray.remote function. How can I determine how many workers are created? I have one node with 20 CPUs and I want to make sure that exactly 20 CPUs are being used with 1 worker per CPU.

Background

I know I can do @ray.remote(num_cpus=20), but from this question, it seems it gives a maximum, while the actual number could be less. My goal is to extract the number of workers/CPUs actually being used.

One way I know is to visualize in the dashboard, but I want to do it programmatically (i.e., assert get_num_workers == 20).

Note

This is a continuation of Multiple Ray instances on one node accessing shared memory, and I am specifically interested in the case of the multiply.remote suggested in the solution. However, the question is general, so the details do not matter.

Ray schedules tasks and starts workers based on how many CPUs each task requests. With the default @ray.remote decorator, num_cpus defaults to 1, so on a 20-CPU node this works out to 20 workers and 20 CPUs being used concurrently, provided at least 20 tasks are submitted at once.
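The arithmetic behind that can be sketched in plain Python. This is a simplified model of logical CPU accounting, not Ray's scheduler; the helper names `max_concurrent_tasks` and `cpus_in_use` are hypothetical and only illustrate the idea.

```python
def max_concurrent_tasks(total_cpus, num_cpus_per_task=1):
    """Upper bound on tasks that fit at once under logical CPU accounting.

    Simplified model (an assumption): each running task reserves
    num_cpus_per_task logical CPUs and nothing else limits concurrency.
    """
    if num_cpus_per_task <= 0:
        # A zero or negative request has no CPU-based bound in this model.
        raise ValueError("num_cpus_per_task must be positive")
    return int(total_cpus // num_cpus_per_task)


def cpus_in_use(cluster_resources, available_resources):
    """Logical CPUs currently reserved: total minus available.

    Both arguments are plain dicts keyed by resource name;
    a missing "CPU" key is treated as 0.
    """
    return cluster_resources.get("CPU", 0) - available_resources.get("CPU", 0)


if __name__ == "__main__":
    print(max_concurrent_tasks(20))      # default num_cpus=1 per task
    print(max_concurrent_tasks(20, 20))  # one task reserving all 20 CPUs
    print(cpus_in_use({"CPU": 20.0}, {"CPU": 4.0}))
```

In a live session, the two dicts could come from `ray.cluster_resources()` and `ray.available_resources()`. Note that these report logical reservations, so they show how many CPUs tasks have claimed rather than how many OS processes exist.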

Generally, you should not need to manage the number of workers yourself, but if you need to check it programmatically for debugging purposes, you can use pgrep to search for ray::. Ray worker processes have titles of the form ray::<currently executing task name or IDLE>.
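For a check from inside Python, the same idea can be sketched without shelling out to pgrep. The version below is an assumption-laden sketch: it reads process titles from /proc, so it only works on Linux, and the helper names `summarize_ray_titles` and `read_process_titles` are hypothetical. It relies only on the ray:: title format described above.

```python
from collections import Counter
from pathlib import Path

RAY_PREFIX = "ray::"


def summarize_ray_titles(titles):
    """Count process titles that start with 'ray::', keyed by the first
    token after the prefix (e.g. 'IDLE' or a task name)."""
    counts = Counter()
    for title in titles:
        title = title.strip()
        if title.startswith(RAY_PREFIX):
            rest = title[len(RAY_PREFIX):].split()
            counts[rest[0] if rest else ""] += 1
    return counts


def read_process_titles(proc_root="/proc"):
    """Return the command line of every visible process (Linux only).

    Yields an empty list on systems without a /proc filesystem.
    """
    titles = []
    for cmdline in Path(proc_root).glob("[0-9]*/cmdline"):
        try:
            raw = cmdline.read_bytes()
        except OSError:
            # The process exited or is not readable; skip it.
            continue
        # Arguments are NUL-separated in /proc/<pid>/cmdline.
        titles.append(raw.replace(b"\0", b" ").decode(errors="replace"))
    return titles


if __name__ == "__main__":
    counts = summarize_ray_titles(read_process_titles())
    print("ray:: processes:", sum(counts.values()), dict(counts))
```

A check such as `assert sum(counts.values()) == 20` is possible, but the total includes idle workers, so it may not match the number of tasks running at that instant.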
