```python
import time

import numpy as np
import ray

ray.init(_node_ip_address='0.0.0.0')

@ray.remote
def f2(x):
    st = time.time()  # record when the task actually starts
    time.sleep(1)
    return st

it_num = 100
st = time.time()  # submission time on the driver
result = ray.get([f2.remote(i) for i in range(it_num)])
print(result)
print(max(result), min(result))
# average delay between submission and each task's start
print(np.average(np.array(result) - st))
ray.shutdown()
```
Above is my code to test task-dispatch time.

I ran it with it_num set to 1, 10, and 100, and the average results were 0.007524728775024414, 0.008671426773071289, and 3.743643021583557 respectively.

The average rises very sharply from 10 to 100. Why does this happen?

P.S. my laptop is a MacBook Pro with a 6-core i9.

Thanks a lot!
It’s because each task reserves a CPU, and while all your CPUs are busy, no more tasks can be scheduled. So, if your machine has 16 CPUs:

1 task => 1 CPU, done instantly
10 tasks => 10 CPUs, done instantly
100 tasks => 16 CPUs → 1 second → next 16 CPUs → 1 second → repeats…
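The wave pattern above can be sketched with simple arithmetic (a hypothetical illustration, not Ray code; `waves` is a made-up helper):

```python
import math

# Hypothetical helper: n one-second tasks run in waves of size `cpus`,
# so total wall time is roughly the number of waves, in seconds.
def waves(n, cpus):
    return math.ceil(n / cpus)

print(waves(1, 16), waves(10, 16), waves(100, 16))  # 1 1 7 → ~1 s, ~1 s, ~7 s
```

That matches the jump you saw: 1 and 10 tasks fit in one wave, but 100 tasks need 7 waves, so later tasks start seconds after submission.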
Thanks, sangcho.
So, is there any way to make Ray tasks share one CPU core, like multithreading?
When you specify num_cpus, it doesn’t actually reserve a CPU; it is just a scheduling hint. You can probably just use num_cpus=0.5 or num_cpus=0.25. Note that the value should divide 1 evenly (0.5, 0.25, …) if it is less than 1.

Note that this means you will run more processes (e.g., if you have 16 CPUs and each task requires 0.5 CPUs, it can start 32 processes). Multithreading usually doesn’t benefit Python workloads much because of the GIL. If you want this for IO-intensive work, I’d recommend using an async actor instead: AsyncIO / Concurrency for Actors — Ray v2.0.0.dev0
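As a rough illustration of why async concurrency helps IO-bound work: this sketch uses plain asyncio from the standard library rather than Ray itself, but Ray’s async actors apply the same idea inside a single actor process:

```python
import asyncio
import time

async def io_task(i):
    # Stand-in for an IO wait (network call, disk read, ...)
    await asyncio.sleep(0.1)
    return i

async def main():
    start = time.time()
    # 100 waits overlap on one thread instead of occupying 100 CPUs
    results = await asyncio.gather(*(io_task(i) for i in range(100)))
    return results, time.time() - start

results, elapsed = asyncio.run(main())
print(len(results), elapsed < 1.0)  # all 100 finish in ~0.1 s, not 10 s
```

Because the tasks spend their time waiting rather than computing, one process can interleave all of them, which is why the async-actor route avoids the per-CPU scheduling limit entirely.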
Thanks sangcho, that worked!