Submit remote work to a specific worker

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

I am trying to set up Ray workers, each one dedicated to a specific GPU. Say there are 4 GPUs; I would like each Ray worker to submit work specifically to one of them.

So from the client script, if there’s some remote work, how do I schedule that work onto a specific worker?

Thank you

@esaliya You can use placement groups with appropriate resources so that tasks are assigned to placement groups, and workers on nodes that satisfy the placement group’s resources will be assigned those tasks.

You can read about the different scheduling techniques and the placement group concepts here.

Thanks Jules. I looked at placement groups, but it was not clear to me how I could keep a dedicated worker for a specific GPU. In this example all 4 GPUs are identical, so resource-wise they’ll be the same. However, I want remote task i to be sent to worker i.

Is this something possible?

@esaliya Not unless you have a cluster where only one worker node has all the GPUs and the others have none, and you specify your placement group accordingly. In that case, each task assigned to the placement group will only be scheduled to the node with 4 GPUs. You probably won’t be able to parallelize it then, since the scheduler can only schedule a single task that meets that requirement; the others will be queued.

cc: @jjyao Is what he’s asking possible?

I think the best way is to create actors and keep an index → actor mapping (i.e., by rank), so that you can send an actor task to a specific one.

@sangcho Do you want to elaborate, maybe with a short example? Thanks!

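import ray

# Assumes Actor is a class decorated with @ray.remote; one actor per GPU (4 in the question)
actors = [Actor.options(num_gpus=1, name=f"rank_{i}").remote() for i in range(4)]

# You can get a specific actor using its name
actor = ray.get_actor(name="rank_0")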

Thanks @sangcho, but I am a little unclear on how to follow your example. Here’s a simplified example of my use case. As shown, I want to invoke task 1 specifically on rank 0 and task 2 on rank 1. Could this be done? I use ranks here to identify workers in the Ray cluster.

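class Scheduler:
    def __init__(self):
        # Map each rank to its actor handle (Actor is assumed to be a @ray.remote class)
        self.workers = {
            "0": Actor.options(name="rank_0").remote(),
            "1": Actor.options(name="rank_1").remote(),
        }

    def call(self, rank):
        # Return the ObjectRef so the caller can ray.get() the result
        return self.workers[rank].function.remote()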

Something like this?