K8s Ray Specifying GPU/Node Type

Saw this discussion here, but it feels like I’m missing the glue for all the appropriate pieces. So I’ll lay out what I know.

Let’s say I want to use the pod given in the docs for use with GPUs:

apiVersion: v1
kind: Pod
metadata:
  generateName: example-cluster-ray-worker
spec:
  ...
  containers:
    - name: ray-node
      image: rayproject/ray:nightly-gpu
      ...
      resources:
        requests:
          cpu: 1000m
          memory: 512Mi
        limits:
          memory: 512Mi
          nvidia.com/gpu: 1

Then, for the sake of argument, let’s say I’ve added two nodes to my cluster (node a and node b), without labels, both correctly set up to allow scaling of GPU resources, and each with a different type of GPU: node a with 8 GB of VRAM and node b with 20 GB.

The question is: how does one specify, using Ray resources, that I want a 20 GB GPU from node b rather than any GPU from either node a or b? Does this require adding node labels which Ray can look at automagically?
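For context, Ray itself does support user-defined custom resources, which is one way to distinguish GPU types at the Ray level. A hypothetical sketch of a cluster-config fragment (the resource name `gpu_20gb` and node-type names are made up, and the exact field layout depends on how you launch the cluster):

```yaml
# Hypothetical sketch: advertise a custom Ray resource on the node type
# backed by the 20 GB GPU node, so tasks can request it explicitly.
available_node_types:
  worker_gpu_20gb:
    min_workers: 0
    max_workers: 2
    # "gpu_20gb" is an arbitrary custom resource name, not a built-in.
    resources: {"GPU": 1, "gpu_20gb": 1}
    node_config:
      ...  # pod spec that pins this worker to node b goes here
```

A task or actor could then be annotated with `@ray.remote(num_gpus=1, resources={"gpu_20gb": 1})` so the scheduler only places it on that node type.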

I appreciate the help!

I think at some point we should provide a detailed guide for this sort of thing…

K8s part –
To describe a pod that gets scheduled on a node with a particular kind of GPU, you’d probably want to add a taint to the node and a matching toleration to the pod. Note that the taint/toleration pair only keeps other pods off that node; to actually steer the worker pod onto it, you’d also add a node label and a matching nodeSelector (or node affinity) on the pod.
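A minimal sketch of what that pod spec could look like, assuming you’ve applied a label and taint to node b yourself (all names here are made up for illustration):

```yaml
# Assumed one-time node setup (hypothetical names):
#   kubectl label nodes node-b gpu-type=20gb
#   kubectl taint nodes node-b gpu-type=20gb:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  generateName: example-cluster-ray-worker
spec:
  nodeSelector:
    gpu-type: 20gb        # steers the pod onto node b
  tolerations:
    - key: gpu-type
      operator: Equal
      value: 20gb
      effect: NoSchedule  # lets the pod tolerate node b's taint
  containers:
    ...
```

The nodeSelector does the targeting; the toleration just unlocks the tainted node for this pod.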

Before I mention the Ray part – are you using the Ray Kubernetes operator or the Ray cluster launcher?

Bumping this up; the docs do not seem to have info on this extremely useful feature.

@Dmitri, can you explain the Ray-on-Kubernetes case?