Ray Worker labels

I haven’t seen anything like this in the documentation, so I believe it does not exist. Would it be possible, or even make sense, for a Ray worker (as in a worker node) to have labels that could be defined as an option, the same way resources are?

This would allow work to be directed to nodes with a specific label. Right now it can be done with custom resources, although that seems to be a misuse of the resource functionality: the task is not really using up any resource, it is just pointing to workers that have a specific set of libraries installed, for example.

Hmm, this is an interesting use case. Ray worker processes are usually created and destroyed dynamically rather than sitting there forever (and resources are allocated to them on demand, rather than being reserved per worker). How do you have a specific set of libraries for a specific worker now?

I might not have explained it properly: by worker I mean a worker node. For example, consider this snippet of a Kubernetes cluster config:

available_node_types:
  head_node:
    node_config:
      apiVersion: v1
      kind: Pod
      metadata:
        generateName: ray-head-
        labels:
          component: ray-head
    resources: {}
  worker_a:
    node_config:
      apiVersion: v1
      kind: Pod
      metadata:
        generateName: ray-worker-a-
        labels:
          component: ray-worker
    min_workers: 0
    max_workers: 5
    resources: {"CPU": 1, "ResourceA": 1}
    worker_setup_commands:
      - pip install numpy
  worker_b:
    node_config:
      apiVersion: v1
      kind: Pod
      metadata:
        generateName: ray-worker-b-
        labels:
          component: ray-worker
    min_workers: 0
    max_workers: 5
    resources: {"CPU": 1, "ResourceB": 1}

There are two types of worker nodes with different custom resources and setup commands, and eventually even different images. This allows a task to declare that it needs a ResourceB, and the cluster will create a worker_b instance to run that task if one is not available. This makes sense if ResourceB is actually a resource, but in a use case where you just want the task to run in a different environment, it only serves as a workaround for what could be a label matched against the available workers.
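To make the workaround concrete, here is a minimal sketch of how a custom-resource request narrows down the node types in the config above. It is a simplified model, not Ray's actual scheduler or autoscaler; `NODE_TYPES`, `node_types_that_fit`, and `remaining_after` are hypothetical names used only for illustration. In Ray itself the request would be attached to the task, for example with `@ray.remote(resources={"ResourceB": 1})`.

```python
# Simplified model of custom-resource matching; NOT Ray's actual scheduler.
# Node types and capacities mirror the cluster config quoted above.
NODE_TYPES = {
    "head_node": {},
    "worker_a": {"CPU": 1, "ResourceA": 1},
    "worker_b": {"CPU": 1, "ResourceB": 1},
}


def node_types_that_fit(request, node_types=NODE_TYPES):
    """Return the node types whose capacity covers every requested resource."""
    return [
        name
        for name, capacity in node_types.items()
        if all(capacity.get(res, 0) >= amount for res, amount in request.items())
    ]


def remaining_after(request, capacity):
    """Capacity left on a node once the request has been placed on it."""
    return {res: total - request.get(res, 0) for res, total in capacity.items()}


# A task that asks for ResourceB can only land on worker_b.
print(node_types_that_fit({"ResourceB": 1}))
# Placing it consumes the whole marker, even though nothing real was used up.
print(remaining_after({"ResourceB": 1}, NODE_TYPES["worker_b"]))
```

In this model, a task that requests the full `ResourceB: 1` leaves none for a second task on the same node, even though the "resource" only marks an environment. That counting side effect is what a label would avoid.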


Ah, I see. I definitely misunderstood the question. The approach you describe is actually the one we recommend (treat the environment like a resource). We discussed introducing constraints or an env type of resource (which would be more straightforward), but it didn’t happen in the end. If you think that sort of API is more desirable, please create an enhancement request!

@hbfernandes it sounds like you’re doing everything correctly. Is there some functionality you can’t achieve using custom resources?

@Alex
I can achieve what I need using custom resources; it just seems odd that I have to assign a number to the resource when it does not really have a resource count. Are there any special values to assign, such as infinite?

We don’t really have a standard.

Serve uses a total resource value of 1 and requests increments of 0.001, so that’s one datapoint.
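As a rough illustration of that datapoint, the arithmetic below shows how many requests fit when a node advertises a total of 1 and each request takes 0.001. It is a sketch of the counting only, not of how Serve or Ray implement it; `max_concurrent_requests` is a hypothetical helper, and `Fraction` is used so the division is exact rather than subject to float rounding.

```python
from fractions import Fraction


def max_concurrent_requests(total, per_request):
    """How many requests of size per_request fit into total at the same time."""
    # Go through str() so that 0.001 becomes exactly 1/1000.
    total = Fraction(str(total))
    per_request = Fraction(str(per_request))
    return total // per_request


# Total of 1, increments of 0.001.
print(max_concurrent_requests(1, 0.001))
# Requesting the whole unit allows only one holder at a time.
print(max_concurrent_requests(1, 1))
```

With a small enough increment, the marker resource behaves almost like a label: many tasks can hold a slice of it on the same node, but the count is still finite.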

The most “special” numbers would probably be MAX_RESOURCE_QUANTITY and MIN_RESOURCE_GRANULARITY (under ray.ray_constants)