On-premise cluster: different worker node types

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes significant difficulty to completing my task, but I can work around it.

Hello,

I have an on-premise Ray cluster (i.e. provider.type: local). I’d like to declare different worker node types in the cluster (e.g. “CPU-only nodes” and “GPU nodes”). I tried adding an available_node_types section to the cluster configuration as follows.

# Rest excluded for brevity

provider:
    type: local
    head_ip: 10.1.0.1
    worker_ips: [10.1.0.2, 10.1.0.3, 10.1.0.4]

# Rest excluded for brevity

available_node_types:
    head_node:
        min_workers: 0
        max_workers: 0
        resources: {"CPU": 2}
    cpu_node:
        min_workers: 1
        max_workers: 1
        resources: { "CPU": 6}
    gpu_node:
        min_workers: 2
        max_workers: 2
        resources: { "CPU": 6, "GPU": 1}
head_node_type: head_node

# Rest excluded for brevity

When I ran ray up to start the cluster, I got the following error:

The field available_node_types is not supported for on-premise clusters.

Is there a way to declare different node types on on-premise clusters?

The workaround I’m thinking about at the moment is to create a separate Ray cluster for each worker node type and have something external to Ray route each workload to the correct cluster.
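For what it’s worth, here is a rough sketch of what that external routing could look like, assuming each per-type cluster exposes a Ray Client endpoint (the addresses and function names below are made up for illustration):

import ray

# Hypothetical Ray Client addresses of the two single-node-type clusters.
CPU_CLUSTER = "ray://10.1.0.10:10001"
GPU_CLUSTER = "ray://10.1.0.20:10001"

@ray.remote
def cpu_job(data):
    # Long-running, CPU-only work.
    return sum(data)

@ray.remote(num_gpus=1)
def gpu_job(data):
    # Latency-sensitive GPU work.
    return len(data)

def submit(kind, data):
    # The "external scheduler": pick a cluster based on the workload type,
    # connect to it, run the task, then disconnect.
    address = GPU_CLUSTER if kind == "gpu" else CPU_CLUSTER
    ray.init(address=address)
    try:
        task = gpu_job if kind == "gpu" else cpu_job
        return ray.get(task.remote(data))
    finally:
        ray.shutdown()

The obvious downside is that each cluster’s resources are siloed, which is why I’d prefer to declare the node types inside a single cluster.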

Thanks

available_node_types is not supported for static on-prem clusters.
Could you explain why you need it?
You should be able to just specify the worker_ips.

Ray should detect node resources correctly; if that’s not the case, then we need an interface for per-node resource overrides.

Hi Dmitri, and sorry for the late reply.

I have two types of workload that I’m planning to run on the Ray cluster:

  1. “CPU-only” workload: CPU only, long-running; high latency (i.e. the time from when a job is submitted until it starts executing) is acceptable.
  2. “GPU” workload: runs on GPUs, short-running; low latency is required.

There are two types of nodes that I can use as cluster nodes:

  1. CPU-only nodes
  2. Nodes that have GPUs

Without a way to force the “CPU-only” workload to run on the “CPU-only” nodes, there is a risk of ending up in the following unwanted situation:

  • Several “CPU-only” jobs (workload type 1) might end up occupying all the cluster nodes, including the GPU ones.
  • New “GPU” jobs will then experience very high latency, since the whole cluster (including the GPU workers) is occupied by the “CPU-only” workload.
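Just to make the constraint concrete, here is roughly how the two workloads look on the application side. The custom resource name "cpu_only" below is hypothetical; pinning with it would only work if each CPU node could actually be declared with that resource, which is exactly what I can’t express with the local provider today:

import ray

ray.init(address="auto")

# GPU workload: requesting a GPU already pins it to the GPU nodes.
@ray.remote(num_cpus=1, num_gpus=1)
def gpu_job(batch):
    ...

# CPU-only workload: requesting only CPUs lets the scheduler place it
# anywhere, including the GPU nodes. A per-node custom resource
# (e.g. "cpu_only") would be one way to keep it off the GPU nodes.
@ray.remote(num_cpus=1, resources={"cpu_only": 1})
def cpu_job(batch):
    ...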

@mohhaseeb Did you solve it? I also had the same problem.

Unfortunately not, @Thien_Nguyen.

We are running into a similar problem, except ours revolves around launching multiple worker nodes on the same server in an effort to isolate resources.

Right now, some of the nodes in our cluster have 32 CPUs and 3 GPUs. If we let such a node be a single worker with 32 CPUs and 3 GPUs, then any task that needs 1 CPU and 1 GPU will take a worker process and start a PyTorch process on one of the GPUs. As the node processes multiple trainings, it hands them to any of the 32 CPU worker processes, and each of those caches PyTorch memory on one of the 3 GPUs (a performance feature of PyTorch). This leads to an OOM on the GPUs, since all 32 worker processes end up reserving their own memory on the 3 GPUs.

One way around this is to set PYTORCH_NO_CUDA_MEMORY_CACHING=1, but that causes severe performance degradation, up to 50% (it is really only intended for debugging).

So the final solution seems to be to isolate resources by creating two workers on a single node: one “trainer” worker with 3 CPUs and 3 GPUs, and another “cpu” worker with 29 CPUs, so that single-CPU tasks can still be highly parallelized.
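For reference, this is how we expect the tasks to be declared against that split; a sketch only, assuming the second worker is registered with 29 CPUs and no GPUs:

import ray

ray.init(address="auto")

# Training task: it asks for a GPU, so it can only land on the "trainer"
# worker (3 CPUs / 3 GPUs). At most 3 run concurrently, which keeps the
# number of processes holding a PyTorch CUDA cache bounded.
@ray.remote(num_cpus=1, num_gpus=1)
def train(config):
    ...

# CPU-only task: no GPU requirement, so it can fan out across the
# 29-CPU "cpu" worker (and any other CPU capacity in the cluster).
@ray.remote(num_cpus=1)
def preprocess(item):
    ...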