How do Ray actors share a GPU?


I see here that it is possible to allocate a fraction of a GPU to a Ray actor, e.g. @ray.remote(num_gpus=0.25).

If N actors are placed on the same GPU, how do they run? (A) Truly concurrently, with their kernels executing on the GPU at the same time, or (B) with Ray only doing the resource bookkeeping, so the actors' kernels are simply time-sliced by the CUDA driver?

If it’s (A), that would be pretty revolutionary, since as far as I know only MPS, CUDA streams, or MIG enable true concurrency on NVIDIA GPUs. If it’s (B), then I encourage putting (A) on the roadmap to make Ray even more appealing.

I think it is just (B): Ray’s role is only to assign a task to a node where enough GPU capacity is available, and to set the CUDA_VISIBLE_DEVICES environment variable so the task uses its assigned GPUs.
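To make (B) concrete, here is a minimal sketch of what that kind of fractional bookkeeping looks like. This is not Ray’s actual scheduler code; `GpuScheduler`, `place`, and `launch_actor` are invented names for illustration, and the point is only that nothing here isolates the actors on the device:

```python
class GpuScheduler:
    """Toy model of fractional-GPU bookkeeping (not Ray's real scheduler)."""

    def __init__(self, num_gpus):
        # Remaining fractional capacity per physical GPU.
        self.free = {gpu_id: 1.0 for gpu_id in range(num_gpus)}

    def place(self, num_gpus_requested):
        # Find any GPU with enough fractional capacity left.
        for gpu_id, capacity in self.free.items():
            if capacity >= num_gpus_requested - 1e-9:
                self.free[gpu_id] -= num_gpus_requested
                return gpu_id
        raise RuntimeError("no GPU with enough free capacity")


def launch_actor(scheduler, num_gpus):
    # Beyond the bookkeeping, the scheduler just points the worker
    # process at its assigned device via CUDA_VISIBLE_DEVICES. The
    # processes then share the GPU with no isolation; the CUDA driver
    # time-slices their kernels.
    gpu_id = scheduler.place(num_gpus)
    env = {"CUDA_VISIBLE_DEVICES": str(gpu_id)}
    return gpu_id, env


scheduler = GpuScheduler(num_gpus=1)
# Four actors, each requesting num_gpus=0.25, all pack onto GPU 0;
# a fifth request would fail because the capacity is exhausted.
placements = [launch_actor(scheduler, 0.25) for _ in range(4)]
print(placements)
```

Under this reading, num_gpus=0.25 is purely an accounting hint: it caps how many actors Ray co-locates on one device, but the actors themselves must keep their combined GPU memory and compute within budget.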

cc @kai can you confirm if this is correct?

Confirmed by @rkn here: python - Ray: How to run many actors on one GPU? - Stack Overflow