How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hi, I am wondering what the actual difference is when using different values of num_gpus. If I set num_gpus=0.1, there will be 10 processes on the GPU (observed in nvidia-smi), but according to "Running more than one CUDA applications on one GPU" on Stack Overflow, different processes cannot run in parallel. I ran the following function with num_gpus=0.1 and with num_gpus=1, and the former took less time: 2.5 s vs 8 s. How can it gain a performance improvement if the functions cannot run in parallel? My guess is that with num_gpus=0.1 the CUDA context creation time of the processes overlaps. What do you think?
import numpy as np
import ray

@ray.remote(num_gpus=0.1)
def f(datatype):
    # Do some work
    # np.random.random takes no dtype argument, so cast afterwards
    x = np.random.random(1000000).astype(datatype)
    for i in range(1):
        x += 1
    return x
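The overlap hypothesis can be sketched without a GPU or Ray at all. Below is a minimal, hypothetical model: each task pays a fixed startup cost (standing in for CUDA context creation), and launching the tasks as concurrent processes amortizes that cost, while running them one after another (as a single num_gpus=1 worker would) pays it serially. The `STARTUP` value and the functions `run_serial`/`run_concurrent` are illustrative assumptions, not Ray APIs.

```python
import time
from multiprocessing import Pool

STARTUP = 0.5  # assumed stand-in for per-process CUDA context creation cost


def task(_):
    # Each task pays the fixed startup cost, then does trivial work.
    time.sleep(STARTUP)
    return 0


def run_serial(n=10):
    """Tasks pay the startup cost one after another (like a single worker)."""
    t0 = time.perf_counter()
    for i in range(n):
        task(i)
    return time.perf_counter() - t0


def run_concurrent(n=10):
    """Tasks pay the startup cost at the same time (like 10 fractional-GPU workers)."""
    with Pool(n) as pool:
        t0 = time.perf_counter()
        pool.map(task, range(n))
        return time.perf_counter() - t0


if __name__ == "__main__":
    print(f"serial: {run_serial():.1f}s, concurrent: {run_concurrent():.1f}s")
```

Even if the GPU kernels themselves time-slice rather than truly run in parallel, the fixed per-process setup overlapping this way would be consistent with the 2.5 s vs 8 s numbers above.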