I was experimenting with Ray job scheduling and had a basic question. Consider this simple program:
_____________________________
import time

import ray

@ray.remote
def f(x):
    time.sleep(5)
    return x + 1

x = f.remote(1)
y = f.remote(2)
print(ray.get(x))
print(ray.get(y))
__________________________
I thought this program would take ~5 sec, because x and y would each be computed by an independently running worker, so both would be available after about 5 sec.
However, it takes ~10 sec, implying that x is computed first and only then does the computation of y start.
Is this the correct interpretation? Does this mean Ray doesn't spawn different threads for workers?
Also, is there any good reading to get a quick overview of such topics in Ray?
I ran the same script and it returned within 5 seconds:
import time

import ray

ray.init()

@ray.remote
def f(x):
    time.sleep(5)
    return x + 1

start = time.time()
x = f.remote(1)
y = f.remote(2)
print(ray.get(x))
print(ray.get(y))
print(time.time() - start)
How did you connect to the Ray cluster? Did you use ray start and ray.init(address='auto')?
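For reference, this is the kind of setup I mean (a minimal sketch; the resource numbers in the output are just illustrative):

# On the head node (shell):
#   ray start --head

# In the script, attach to the running cluster instead of starting a
# fresh local Ray instance:
import ray

ray.init(address='auto')          # connect to the cluster started via `ray start`
print(ray.cluster_resources())    # e.g. {'CPU': 2.0, ...} -- what Ray can schedule on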
Also, tasks are scheduled per worker process; we don't have first-class threading support. (Also note that threading doesn't always achieve parallelism in Python because of the GIL, except for I/O-heavy workloads or libraries such as NumPy that release it.)
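For instance, you can check how many CPU slots Ray has registered, which bounds how many tasks run at the same time (a quick sketch; the numbers depend on your machine or cluster):

import ray

ray.init()

# Total resources Ray knows about, e.g. {'CPU': 4.0, ...}.
# By default each task reserves 1 CPU, so this caps concurrent tasks.
print(ray.cluster_resources())

# Resources not currently claimed by running tasks or actors.
print(ray.available_resources())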
I realized that the remote VM I was working with had just 1 CPU. When replying to Archit, I didn't realize that there were two nodes running Ray in my cluster (hence it showed CPU: 2). Sorry for the confusion.
The script finished in ~5 sec when I ran it on my local 4-core machine.
Yes, I understand threading isn't really parallel. I was assuming there was an abstraction of parallelism among workers irrespective of the number of CPUs. Thank you for the clarification.
So, for now, the number of (apparently) parallel tasks is equal to the number of CPUs in the cluster, right?
Yes, that's correct! But you can actually run more worker processes than the number of physical CPUs by setting a high num-cpus (num-cpus doesn't have to match the real number of CPUs). For actors, we technically support threading (with the argument max_concurrency), but it is not really recommended for Python (for Java, though, we do run many threads per worker process).
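To illustrate both points (just a sketch; the Sleeper class and the numbers are made up for the example):

import time

import ray

# Tell Ray to expose 8 CPU slots even if the machine has fewer physical
# cores; up to 8 tasks can then be scheduled at once.
ray.init(num_cpus=8)

# A threaded actor: up to 2 method calls may run concurrently inside the
# same worker process (of limited use in Python because of the GIL).
@ray.remote(max_concurrency=2)
class Sleeper:
    def slow(self):
        time.sleep(5)
        return 1

s = Sleeper.remote()
# Both calls can be in flight at the same time within one actor.
print(ray.get([s.slow.remote(), s.slow.remote()]))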
For completeness: setting num-cpus seems to give threading-like behaviour.
On a one-core machine:
Initialising with ray start --head --num-cpus=1, the above program takes ~10 sec.
Initialising with ray start --head --num-cpus=2, the above program takes ~5 sec.
Oh, I think that's because a process that calls sleep isn't using the CPU (so it looks like there's parallelism here). It's similar to the scenario where you run many I/O operations with Python threads.
@sangcho Thanks. Yeah, that's probably a bad example. I meant that increasing num-cpus gives threading-like behaviour (maybe not useful, but just for pedagogical reasons :p). Here's probably a better example:
import time

import numpy as np
import ray

ray.init()  # or ray.init(address='auto') to attach to a cluster started with `ray start`

@ray.remote
def f(a):
    # CPU-bound work: repeated element-wise multiplications on a large array.
    print('entering...')
    d = a * a
    for i in range(50):
        d = d * a
    print('middle...')
    for i in range(50):
        d = d * a
    return 0

a = np.random.rand(3000, 3000)
start = time.time()
x = f.remote(a)
y = f.remote(a)
print(ray.get(x))
print(ray.get(y))
print(time.time() - start)