Hi,
I want to distribute actors across multiple GPUs.
For example, if I have 4 GPUs and spawn 4 actors, it should be 1 actor per GPU.
This sounds simple, since I can just use the argument num_gpus=1.
However, I have other actors to assign as well.
So, let's assume I have 4 actors to be distributed and another 4 actors that also use the GPU.
When I use num_gpus=0.5, Ray may place 2 actors from the same group on 1 GPU.
Maybe if I use num_gpus=0.6 for the actors to be distributed and num_gpus=0.4 for the other 4 actors, I can make it work.
However, if the number of actors is not fixed, I cannot use fixed fractions like these.
@Alex
Hi, in this case the first group of actors and the second group have different execution times.
Therefore, it would be better to distribute the actors so that every GPU has the same actor configuration.
Is there any update on this? I would like to do the same thing with either remote functions or actors for a data-parallel task, and I'm working in a homogeneous GPU setup (8 V100s). I know that Ray was mainly meant to improve task parallelism, but I'm interested in exploring ways to accelerate data parallelism with GPUs, especially with the Ray communication library: Ray Collective Communication Lib — Ray 1.12.0. Even in the examples in the documentation, I find that operations are placed on only 1 GPU when I have multiple GPUs available. Naively doing data parallelism with Ray remote functions will be slower due to serialization/deserialization through the object store and data transfers between GPU and CPU memory without utilizing NVLink connections, especially when communication is needed in algorithms like distributed DGEMM.
For example, if you want, say, 4 actors to be on the same GPU, you could create a placement group with a {"GPU": 1} bundle. Actors can then be targeted at that bundle using GPUActor.options(placement_group=pg, placement_group_bundle_index=0). This lets you force the actors onto the same GPU.
No other actors will be scheduled onto that bundle (it’s reserved) unless explicitly targeted.
Note that the actors still need to fit in the placement group bundle (total num_gpus <= 1 if you create a 1 GPU bundle). Hope this helps.