Assign remote functions to specific nodes in a Ray cluster

I read the docs on launching Ray clusters, but the examples don’t make clear how to assign remote functions to specific nodes. For example, with 3 worker nodes and 1 head node, how can I make some remote functions (decorated with @ray.remote) run on one particular worker node?

You can achieve this with custom resources. By default, Ray creates a resource on every node that is named after the node’s IP address ("node:<ip>"), so you can do

@ray.remote(resources={"node:<worker-ip>": 0.001})
def foo():
    # This always runs on the node whose IP is <worker-ip>
    ...
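
If you are not sure how the per-node resource is spelled on your cluster, you can read the keys from ray.nodes() instead of typing them by hand. The snippet below is only a sketch: node_resource_keys and main are hypothetical helper names, and it assumes the automatically created keys start with "node:" (recent Ray versions also put a "node:__internal_head__" marker on the head node, which is skipped here).

```python
def node_resource_keys(nodes):
    """Collect the per-node "node:<ip>" resource keys from a ray.nodes()-style list."""
    keys = []
    for node in nodes:
        if not node.get("Alive", False):
            continue  # skip nodes that have left the cluster
        for name in node.get("Resources", {}):
            # Assumption: per-node keys start with "node:"; skip the head marker.
            if name.startswith("node:") and name != "node:__internal_head__":
                keys.append(name)
    return sorted(keys)


def main():
    # Needs a running cluster; call main() from a driver that can reach it.
    import ray

    ray.init(address="auto")

    @ray.remote
    def foo():
        import socket
        return socket.gethostname()

    keys = node_resource_keys(ray.nodes())  # includes the head node's key
    # Pin one call to the first node in the list; 0.001 is just a token amount.
    pinned = foo.options(resources={keys[0]: 0.001})
    print(ray.get(pinned.remote()))
```

Using .options() at call time lets the same function be pinned to different nodes without redefining it.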

That being said, if your goal is really to make sure that certain tasks are colocated, you should consider looking into the PACK or STRICT_PACK placement groups.

https://docs.ray.io/en/latest/ray-core/placement-group.html#strategy-types

Hi, thanks. I created a Ray cluster on Kubernetes. Can I use a pod’s name as the resource key instead? For example:

@ray.remote(resources={"RAY_CLUSTER_POD_0_NAME": 0.001})
def foo1():
    ...

@ray.remote(resources={"RAY_CLUSTER_POD_1_NAME": 0.005})
def foo2():
    ...

Two potential forks for discussion:

  1. What’s the intended use-case? Why do you need to assign tasks to specific pods? Instinct tells me that this is an anti-pattern – it’s subverting the Ray scheduler’s function.
  2. What configs are you using to launch your Ray cluster on K8s? (Asking because there’s currently more than one stack for deploying Ray on K8s.)

Dear Dmitri, thank you. My answers are below; please correct me if anything is wrong.

  1. What’s the intended use-case? Why do you need to assign tasks to specific pods? Instinct tells me that this is an anti-pattern – it’s subverting the Ray scheduler’s function.

A1: I created many RL environment actors, and I would like them to run on remote CPU pods, say 20 of them. If I could refer to each pod as a resource, I could explicitly assign every actor to a pod.

  2. What configs are you using to launch your Ray cluster on K8s? (Asking because there’s currently more than one stack for deploying Ray on K8s.)

A2: I followed this tutorial to launch my clusters.

Let’s say the pods each have 16 CPUs.
If you’d like to run one actor per pod, you’d define the actors with

@ray.remote(num_cpus=16)
class Actor:
    ...

The Ray scheduler will then schedule one actor per pod, since a 16-CPU pod can fit only one actor that requests 16 CPUs.

If you’re scheduling 20 of these actors, you can allocate them with
actors = [Actor.remote() for _ in range(20)]
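
The same arithmetic tells you what happens with other num_cpus values. A rough sketch, assuming CPU is the only resource that limits placement and that each pod advertises all of its CPUs to Ray (actors_per_node and nodes_needed are hypothetical helper names, not part of Ray):

```python
def actors_per_node(node_cpus, actor_cpus):
    """How many actors requesting actor_cpus fit on a node with node_cpus."""
    if actor_cpus <= 0:
        raise ValueError("actor_cpus must be positive")
    return int(node_cpus // actor_cpus)


def nodes_needed(num_actors, node_cpus, actor_cpus):
    """Minimum number of identical nodes needed to place num_actors actors."""
    per_node = actors_per_node(node_cpus, actor_cpus)
    if per_node == 0:
        raise ValueError("one actor does not fit on a single node")
    return -(-num_actors // per_node)  # ceiling division


# 16-CPU pods, actors requesting 16 CPUs: one actor per pod, 20 pods for 20 actors.
print(actors_per_node(16, 16), nodes_needed(20, 16, 16))
# Requesting 8 CPUs instead would let two actors share a pod.
print(actors_per_node(16, 8), nodes_needed(20, 16, 8))
```

So requesting the full CPU count of a pod is what forces the one-actor-per-pod layout; a smaller request lets the scheduler put several actors on one pod.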

Alternatively, you can use a STRICT_SPREAD placement group to make sure that each actor is scheduled on its own pod.
https://docs.ray.io/en/latest/ray-core/placement-group.html
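
A minimal sketch of the placement group route, assuming a recent Ray release where ray.util.placement_group and PlacementGroupSchedulingStrategy are available. EnvActor, spread_bundles and main are made-up names for illustration, and the 1-CPU bundle size is only an example value. Note that STRICT_SPREAD needs at least as many nodes as bundles, otherwise the group never becomes ready.

```python
def spread_bundles(num_actors, cpus_per_actor=1):
    """One resource bundle per actor; STRICT_SPREAD puts each bundle on a different node."""
    return [{"CPU": cpus_per_actor} for _ in range(num_actors)]


def main(num_actors=20):
    # Needs a running cluster with at least num_actors nodes.
    import ray
    from ray.util.placement_group import placement_group
    from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

    ray.init(address="auto")

    @ray.remote(num_cpus=1)
    class EnvActor:
        def hostname(self):
            import socket
            return socket.gethostname()

    pg = placement_group(spread_bundles(num_actors), strategy="STRICT_SPREAD")
    ray.get(pg.ready())  # blocks until every bundle has been reserved

    # Bind actor i to bundle i so that each actor lands on a different node.
    actors = [
        EnvActor.options(
            scheduling_strategy=PlacementGroupSchedulingStrategy(
                placement_group=pg,
                placement_group_bundle_index=i,
            )
        ).remote()
        for i in range(num_actors)
    ]
    # Each hostname should appear exactly once.
    print(sorted(ray.get([a.hostname.remote() for a in actors])))
```

Compared with requesting all 16 CPUs per actor, this keeps the rest of each pod’s CPUs available for other tasks.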

Thank you. I see, Ray will automatically schedule the tasks across all pods.