Assign specific nodes to remote functions in ray cluster

GoingMyWay · June 7, 2022, 5:02pm

I read the docs on launching Ray clusters. However, the examples don't make it clear how to assign remote functions to specific nodes. For example, if there are 3 worker nodes and 1 head node and I want to assign some remote functions (with @ray.remote) to one of the worker nodes, how can I achieve it?

Alex · June 7, 2022, 6:47pm

You can achieve this with custom resources. By default, Ray creates a custom resource for each node's IP, so you can do:

# Ray's automatic per-node resource is named "node:<ip>"
@ray.remote(resources={"node:<worker-ip>": 0.001})
def foo():
    # This always runs on the same node
    ...

That being said, if your goal is really to make sure that certain tasks are colocated, you should consider looking into the PACK or STRICT_PACK placement groups.

https://docs.ray.io/en/latest/ray-core/placement-group.html#strategy-types
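
A minimal sketch of the node-pinning approach above, end to end. It assumes a Ray session is already connected (ray.init() has been called) and that the automatic per-node resource is named node:<ip>, which is what ray.cluster_resources() reports in the Ray releases I know of; check your own cluster's output before relying on it. The helper names (node_resource, alive_node_ips, run_on_node) are made up for illustration and are not part of Ray.

```python
def node_resource(ip, amount=0.001):
    """Build the resources dict that pins a task to the node with this IP.

    Assumption: Ray exposes one automatic resource per node, named "node:<ip>".
    A tiny amount is requested so many tasks can share the same node.
    """
    return {f"node:{ip}": amount}


def alive_node_ips(nodes):
    """Extract IPs of alive nodes from a list shaped like the result of ray.nodes()."""
    return [n["NodeManagerAddress"] for n in nodes if n.get("Alive")]


def run_on_node(ip):
    """Run a trivial task on the node with the given IP and return the IP it ran on.

    Requires Ray to be installed and ray.init() to have been called already.
    """
    import ray  # imported lazily so the pure helpers above work without Ray

    @ray.remote
    def where_am_i():
        return ray.util.get_node_ip_address()

    ref = where_am_i.options(resources=node_resource(ip)).remote()
    return ray.get(ref)
```

On a connected cluster, something like `[run_on_node(ip) for ip in alive_node_ips(ray.nodes())]` should return each node's own IP if the pinning works as described.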

GoingMyWay · June 8, 2022, 1:28am

Hi, thanks. I created a Ray k8s cluster. Can I assign tasks by the node's pod name instead? For example:

@ray.remote(resources={"RAY_CLUSTER_POD_0_NAME": 0.001})
def foo1():
    ...

@ray.remote(resources={"RAY_CLUSTER_POD_1_NAME": 0.005})
def foo2():
    ...

Dmitri · June 8, 2022, 8:30am

Two potential forks for discussion:

1. What’s the intended use-case? Why do you need to assign tasks to specific pods? Instinct tells me that this is an anti-pattern: it’s subverting the Ray scheduler’s function.
2. What configs are you using to launch your Ray cluster on K8s? (Asking because there’s currently more than one stack for deploying Ray on K8s.)

GoingMyWay · June 8, 2022, 9:09am

Dear Dmitri, thank you. My thoughts are below; please correct me if anything is wrong.

What’s the intended use-case? Why do you need to assign tasks to specific pods? Instinct tells me that this is an anti-pattern – it’s subverting the Ray scheduler’s function.

A1: I create many RL environment actors and want them to run on remote CPU pods, say 20 pods. If I can get the pods' resource names, I can explicitly assign each actor to a pod.

What configs are you using to launch your Ray cluster on K8s? (Asking because there’s currently more than one stack for deploying Ray on K8s.)

A2: I followed this tutorial to launch my clusters.

Dmitri · June 8, 2022, 9:17am

Let’s say the pods each have 16 CPUs.
If you’d like to run one actor per pod, you’d define the actors with:

@ray.remote(num_cpus=16)
class Actor:
    ...

The Ray scheduler will then schedule one actor per pod, since a 16-CPU pod can fit only one actor requesting 16 CPUs.

Dmitri · June 8, 2022, 9:20am

If you’re scheduling 20 of these actors, you can create them with:
actors = [Actor.remote() for _ in range(20)]
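
To make the packing arithmetic concrete, here is a small sketch of how many actors fit on a pod for a given CPU request, and how many pods a batch of actors would need. It only models CPU requests (the real scheduler also considers memory and custom resources), and the function names are hypothetical, not Ray APIs. The last function shows the Ray side and needs Ray installed and ray.init() already called.

```python
import math


def actors_per_pod(pod_cpus, actor_cpus):
    """How many actors requesting actor_cpus fit on a pod with pod_cpus (CPU only)."""
    return int(pod_cpus // actor_cpus)


def pods_needed(num_actors, pod_cpus, actor_cpus):
    """Minimum number of identical pods needed to host num_actors actors."""
    per_pod = actors_per_pod(pod_cpus, actor_cpus)
    if per_pod == 0:
        raise ValueError("an actor requests more CPUs than a single pod has")
    return math.ceil(num_actors / per_pod)


def launch_env_actors(num_actors, cpus_per_actor):
    """Create num_actors actors, each reserving cpus_per_actor CPUs.

    Requires Ray to be installed and ray.init() to have been called already.
    """
    import ray  # imported lazily so the arithmetic helpers work without Ray

    @ray.remote(num_cpus=cpus_per_actor)
    class EnvActor:
        def node_ip(self):
            # Report which node (pod) this actor landed on
            return ray.util.get_node_ip_address()

    return [EnvActor.remote() for _ in range(num_actors)]
```

With 16-CPU pods, requesting 16 CPUs per actor gives one actor per pod, so 20 actors need 20 pods; requesting 4 CPUs per actor would let the scheduler place up to four actors on each pod.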

Dmitri · June 8, 2022, 9:23am

Alternatively you can use a STRICT_SPREAD placement group to make sure that each actor is scheduled in its own pod.
https://docs.ray.io/en/latest/ray-core/placement-group.html
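
A rough sketch of the STRICT_SPREAD option, assuming a Ray version that provides PlacementGroupSchedulingStrategy (older releases passed placement_group= and placement_group_bundle_index= directly to .options() instead). The names spread_bundles, launch_spread_actors and EnvActor are hypothetical. It also assumes the cluster has at least as many nodes as actors; otherwise pg.ready() will not resolve, because STRICT_SPREAD requires every bundle to sit on a different node.

```python
def spread_bundles(num_actors, cpus_per_actor=1):
    """One resource bundle per actor; STRICT_SPREAD puts each bundle on its own node."""
    return [{"CPU": cpus_per_actor} for _ in range(num_actors)]


def launch_spread_actors(num_actors, cpus_per_actor=1):
    """Reserve one bundle per node and start one actor in each bundle.

    Requires Ray to be installed and ray.init() to have been called already.
    """
    import ray  # imported lazily so spread_bundles works without Ray
    from ray.util.placement_group import placement_group
    from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

    @ray.remote(num_cpus=cpus_per_actor)
    class EnvActor:
        def node_ip(self):
            return ray.util.get_node_ip_address()

    pg = placement_group(
        spread_bundles(num_actors, cpus_per_actor), strategy="STRICT_SPREAD"
    )
    ray.get(pg.ready())  # blocks until every bundle has been reserved

    actors = [
        EnvActor.options(
            scheduling_strategy=PlacementGroupSchedulingStrategy(
                placement_group=pg,
                placement_group_bundle_index=i,
            )
        ).remote()
        for i in range(num_actors)
    ]
    return pg, actors
```

Unlike the num_cpus=16 trick, this does not depend on the actor's CPU request matching the pod size, so small actors can still be forced onto separate pods.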

GoingMyWay · June 8, 2022, 10:54am

Thank you. I see: Ray will automatically schedule tasks across all pods.
