I create a new Ray cluster in Kubernetes with the following command:
helm install raycluster kuberay/ray-cluster --version 1.1.0 -f values.yaml --debug
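To confirm the cluster comes up, I check the pods the operator creates (assuming the default fullname for this release, raycluster-kuberay; KubeRay labels each pod with its cluster name):

kubectl get pods -l ray.io/cluster=raycluster-kuberay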
I have the following values.yaml file.
image:
  repository: rayproject/ray
  tag: 2.10.0
  pullPolicy: IfNotPresent

nameOverride: "kuberay"
fullnameOverride: ""

imagePullSecrets: []

common:
  containerEnv: {}

head:
  rayVersion: 2.10.0
  enableInTreeAutoscaling: true
  autoscalerOptions:
    upscalingMode: Default
    idleTimeoutSeconds: 20
    imagePullPolicy: Always
    securityContext: {}
    env: []
    envFrom: []
    resources:
      limits:
        cpu: "500m"
        memory: "512Mi"
      requests:
        cpu: "500m"
        memory: "512Mi"
  labels: {}
  serviceAccountName: ""
  rayStartParams:
    dashboard-host: '0.0.0.0'
    num-cpus: 0
  containerEnv: []
  envFrom: []
  resources:
    limits:
      cpu: "1"
      memory: "2G"
    requests:
      cpu: "1"
      memory: "2G"
  annotations: {}
  nodeSelector: {}
  tolerations:
    - effect: NoSchedule
      key: kubernetes.azure.com/scalesetpriority
      operator: Equal
      value: spot
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values:
                  - spot
  securityContext: {}
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  sidecarContainers: []
  command: []
  args: []
  headService: {}

worker:
  groupName: workergroup
  replicas: 1
  minReplicas: 0
  maxReplicas: 3
  labels: {}
  serviceAccountName: ""
  rayStartParams:
    resources: '"{\"default-worker-group-node\": 1}"'
  containerEnv: []
  envFrom: []
  resources:
    limits:
      cpu: "1"
      memory: "1G"
    requests:
      cpu: "1"
      memory: "1G"
  annotations: {}
  nodeSelector: {}
  tolerations:
    - effect: NoSchedule
      key: kubernetes.azure.com/scalesetpriority
      operator: Equal
      value: spot
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values:
                  - spot
  securityContext: {}
  volumes:
    - name: log-volume
      emptyDir: {}
  volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
  sidecarContainers: []
  command: []
  args: []

service:
  type: LoadBalancer
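As I understand it, KubeRay passes the worker rayStartParams through to the ray start command on each worker pod, so every worker should advertise exactly 1 unit of the custom resource, roughly like this (the exact quoting is the chart's, not mine):

ray start --resources='{"default-worker-group-node": 1}'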
I want to understand why the following code launches 2 worker pods. Each task requires only 0.2 units of the custom resource, while each worker pod starts with 1 unit of it (default-worker-group-node), so the two concurrent tasks need 0.4 units in total and should fit on a single pod. The same code finishes on a single pod when I decorate the tasks with @ray.remote(num_cpus=0.2) instead (see the variant after the script below). Is this the correct behaviour when using custom resources for scheduling?
import time

import ray

# Connect to the cluster through the Ray client port on the head service.
ray.init(address="ray://<ray-head-service-ip>:10001")
print(ray.cluster_resources())

# Each task asks for 0.2 units of the custom resource, so two concurrent
# tasks need 0.4 units; a single worker advertises 1 unit.
@ray.remote(resources={"default-worker-group-node": 0.2})
def hello_world():
    time.sleep(130)
    return "Local machine says hello to the remote cluster "

@ray.remote(resources={"default-worker-group-node": 0.2})
def hello_world_small():
    time.sleep(60)
    return "Local machine says hello to the remote cluster : hello_world_small"

start = time.time()
a = hello_world.remote()
time.sleep(30)  # the second task is submitted while the first is still running
b = hello_world_small.remote()
print(ray.get(a))
print(ray.get(b))
end = time.time()
print("Program ran for", end - start, "seconds")
ray.shutdown()
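For reference, the variant that completes on a single worker pod differs only in the decorators; everything else in the script is unchanged:

# Scheduling on fractional CPUs instead of the custom resource;
# with these decorators both tasks run on one worker pod.
@ray.remote(num_cpus=0.2)
def hello_world():
    time.sleep(130)
    return "Local machine says hello to the remote cluster "

@ray.remote(num_cpus=0.2)
def hello_world_small():
    time.sleep(60)
    return "Local machine says hello to the remote cluster : hello_world_small"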