System config parameters on ray worker

Hi there!

I’m trying to apply system config settings like '{"object_spilling_threshold": 0.99}' to the ray-worker nodes.

If I put it in the worker's rayStartParams section of the YAML, ray start raises:

ValueError: System config parameters can only be set on the head node.
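
For context, the worker group looked roughly like this (a sketch of the rendered RayCluster spec; only the system-config line is the relevant part):

workerGroupSpecs:
- groupName: workergroup
  rayStartParams:
    dashboard-agent-listen-port: "52365"
    metrics-export-port: "9001"
    system-config: "'{\"object_spilling_threshold\":0.99}'"

KubeRay turns rayStartParams into ray start flags, and ray start rejects --system-config on non-head nodes.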

If the setting is applied only to the head, for example:

        rayStartParams:
          dashboard-agent-listen-port: "52365"
          metrics-export-port: "9001"
          ...
          system-config: "'{\"object_spilling_threshold\":0.99}'"

then the workers do not pick up the system-config parameters.

The question is: how can I apply system-config parameters on Ray workers?

To be clearer: I am setting up a cluster in Kubernetes with one head node and multiple worker nodes.

The Best practices for deploying large clusters — Ray 2.3.1 guide recommends avoiding scheduling additional tasks on the head node.
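
Following that advice, the head is started with zero logical CPUs so Ray does not schedule tasks or actors on it (this matches the head config below):

rayStartParams:
  num-cpus: "0"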

Because of that, everything runs on the workers: my application needs to keep continuously updated model weights in memory, and they are passed to the serving actors via shared memory (/dev/shm). To this end, I allocate enough memory for the object store on the worker node. But the raylet starts spilling objects as soon as object store usage crosses the 0.8 full threshold.

This is very inconvenient for serving my models, since I effectively have to reserve an extra ~20% of object store memory that stays almost unused all the time.
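
For illustration, the serving pattern looks roughly like this (a minimal sketch with made-up names; ray.get of a numpy array from the object store gives a zero-copy, read-only view backed by /dev/shm):

import numpy as np
import ray

ray.init()  # connect to the running cluster

@ray.remote
class ServingActor:
    def load(self, weights):
        # The ObjectRef passed to .remote() is resolved by Ray; for numpy
        # arrays the actor receives a zero-copy, read-only view backed by
        # the plasma store in /dev/shm, so the weights are not duplicated.
        self.weights = weights
        return self.weights.nbytes

# The trainer periodically ray.put()s fresh weights and hands out the ref.
weights_ref = ray.put(np.zeros((20_000, 20_000)))  # ~3.2 GB in the object store
actors = [ServingActor.remote() for _ in range(4)]
print(ray.get([a.load.remote(weights_ref) for a in actors]))

As long as an actor (or the driver) holds a reference, the array stays pinned in the object store, which is why I want to fill it well past the default 0.8 spilling threshold.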

I hope I'm just missing something in the docs.


Configuration details:

Head node:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: *
    cni.projectcalico.org/podIPs: *
    ray.io/external-storage-namespace: *
    ray.io/ft-enabled: "true"
    ray.io/health-state: ""
  creationTimestamp: "2023-04-05T22:13:39Z"
  generateName: kuberay-cluster-head-
  labels:
    app.kubernetes.io/created-by: kuberay-operator
    app.kubernetes.io/instance: kuberay-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-0.4.0
    ray.io/cluster: kuberay-cluster
    ray.io/cluster-dashboard: kuberay-cluster-dashboard
    ray.io/group: headgroup
    ray.io/identifier: kuberay-cluster-head
    ray.io/is-ray-node: "yes"
    ray.io/node-type: head
  name: kuberay-cluster-head-zmvvx
  namespace: kuberay
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: kuberay-cluster
    uid: *
  resourceVersion: "275448686"
  uid: *
spec:
  affinity: {}
  containers:
  - args:
    - 'ulimit -n 65536; ray start --head  --system-config=''{"object_spilling_threshold":0.99}''  --block  --memory=8000000000  --dashboard-agent-listen-port=52365  --dashboard-host=0.0.0.0  --metrics-export-port=9001  --num-cpus=0 '
    command:
    - /bin/bash
    - -lc
    - --
    env:
    - name: RAY_ROTATION_BACKUP_COUNT
      value: "1"
    - name: RAY_ROTATION_MAX_BYTES
      value: "134217728"
    - name: RAY_GRAFANA_HOST
      value: 8
    - name: RAY_REDIS_ADDRESS
      value: *
    - name: RAY_IP
      value: 127.0.0.1
    - name: RAY_PORT
      value: "6379"
    - name: RAY_ADDRESS
      value: 127.0.0.1:6379
    - name: RAY_USAGE_STATS_KUBERAY_IN_USE
      value: "1"
    - name: REDIS_PASSWORD
    - name: RAY_external_storage_namespace
      value: *
    image: rayproject/ray:2.3.0-py39-cu113
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - ray stop
    livenessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep
          success
        - '&&'
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:8265/api/gcs_healthz | grep success
      failureThreshold: 40
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: ray-head
    ports:
    - containerPort: 10001
      name: client
      protocol: TCP
    - containerPort: 8265
      name: dashboard
      protocol: TCP
    - containerPort: 8000
      name: ray-serve
      protocol: TCP
    - containerPort: 52365
      name: dashboard-agent
      protocol: TCP
    - containerPort: 6379
      name: redis
      protocol: TCP
    - containerPort: 9001
      name: http-metrics
      protocol: TCP
    - containerPort: 8080
      name: metrics
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep
          success
        - '&&'
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:8265/api/gcs_healthz | grep success
      failureThreshold: 20
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "4"
        ephemeral-storage: 1G
        memory: 8G
      requests:
        cpu: "4"
        ephemeral-storage: 1G
        memory: 8G
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
    - mountPath: /fluent-bit/etc/fluent-bit.conf
      name: fluentbit-config
      subPath: fluent-bit.conf
    - mountPath: /dev/shm
      name: shared-mem
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  - image: fluent/fluent-bit:1.9.6
    imagePullPolicy: IfNotPresent
    name: fluentbit
    resources:
      limits:
        cpu: 100m
        ephemeral-storage: 1G
        memory: 128Mi
      requests:
        cpu: 100m
        ephemeral-storage: 1G
        memory: 128Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: *
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: log-volume
  - configMap:
      defaultMode: 420
      name: fluentbit-config
    name: fluentbit-config
  - emptyDir:
      medium: Memory
      sizeLimit: 8G
    name: shared-mem
  - name: *
    secret:
      defaultMode: *
      secretName: *
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:39Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:51Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:51Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:39Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://974a1d165914055879956729bbc490c34797c9fc6a143e68b66d669be680156c
    image: docker.io/fluent/fluent-bit:1.9.6
    imageID: docker.io/fluent/fluent-bit@sha256:dff47966c7c5f91fdfe44e94938db092902e0670302c853bd463d8e277756751
    lastState: {}
    name: fluentbit
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-05T22:13:40Z"
  - containerID: containerd://82956fb6e79e8af210726bce13a98eaaf9e6d3069c2ce1988650a32223d108a2
    image: docker.io/rayproject/ray:2.3.0-py39-cu113
    imageID: docker.io/rayproject/ray@sha256:c5d12e42896e384f508f187b0280e93871fba03465efc3289f1a08fc401671b2
    lastState: {}
    name: ray-head
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-05T22:13:40Z"
  hostIP: *
  phase: Running
  podIP: *
  podIPs:
  - ip: *
  qosClass: Guaranteed
  startTime: "2023-04-05T22:13:39Z"

Worker node:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: *
    cni.projectcalico.org/podIPs: *
    key: value
    ray.io/external-storage-namespace: *
    ray.io/ft-enabled: "true"
    ray.io/health-state: ""
  creationTimestamp: "2023-04-06T13:00:18Z"
  generateName: kuberay-cluster-worker-workergroup-
  labels:
    app.kubernetes.io/created-by: kuberay-operator
    app.kubernetes.io/instance: kuberay-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-0.4.0
    ray.io/cluster: kuberay-cluster
    ray.io/cluster-dashboard: kuberay-cluster-dashboard
    ray.io/group: workergroup
    ray.io/identifier: kuberay-cluster-worker
    ray.io/is-ray-node: "yes"
    ray.io/node-type: worker
  name: kuberay-cluster-worker-workergroup-*
  namespace: kuberay
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: kuberay-cluster
    uid: *
  resourceVersion: "276005140"
  uid: *
spec:
  affinity: {}
  containers:
  - args:
    - 'ulimit -n 65536; ray start  --dashboard-agent-listen-port=52365  --metrics-export-port=9001  --address=kuberay-cluster-head-svc:6379  --num-cpus=32  --memory=171798691840  --object-store-memory=75161927680  --block '
    command:
    - /bin/bash
    - -lc
    - --
    env:
    - name: RAY_ROTATION_BACKUP_COUNT
      value: "1"
    - name: RAY_ROTATION_MAX_BYTES
      value: "134217728"
    - name: RAY_GRAFANA_HOST
      value: *
    - name: RAY_IP
      value: kuberay-cluster-head-svc
    - name: RAY_PORT
      value: "6379"
    - name: RAY_ADDRESS
      value: kuberay-cluster-head-svc:6379
    - name: RAY_USAGE_STATS_KUBERAY_IN_USE
      value: "1"
    - name: REDIS_PASSWORD
    - name: *
      value: *
    image: rayproject/ray:2.3.0-py39-cu113
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - ray stop
    livenessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep
          success
      failureThreshold: 40
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: ray-worker
    ports:
    - containerPort: 10001
      name: client
      protocol: TCP
    - containerPort: 8265
      name: dashboard
      protocol: TCP
    - containerPort: 8000
      name: ray-serve
      protocol: TCP
    - containerPort: 52365
      name: dashboard-agent
      protocol: TCP
    - containerPort: 6379
      name: redis
      protocol: TCP
    - containerPort: 9001
      name: http-metrics
      protocol: TCP
    - containerPort: 8080
      name: metrics
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep
          success
      failureThreshold: 20
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "32"
        ephemeral-storage: 500M
        memory: 160Gi
      requests:
        cpu: "32"
        ephemeral-storage: 500M
        memory: 160Gi
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
    - mountPath: /fluent-bit/etc/fluent-bit.conf
      name: fluentbit-config
      subPath: fluent-bit.conf
    - mountPath: /dev/shm
      name: shared-mem
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  - image: fluent/fluent-bit:1.9.6
    imagePullPolicy: IfNotPresent
    name: fluentbit
    resources:
      limits:
        cpu: 100m
        ephemeral-storage: 500M
        memory: 128Mi
      requests:
        cpu: 100m
        ephemeral-storage: 500M
        memory: 128Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - sh
    - -c
    - until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local;
      do echo waiting for K8s Service $RAY_IP; sleep 2; done
    env:
    - name: RAY_IP
      value: kuberay-cluster-head-svc
    image: busybox:1.28
    imagePullPolicy: IfNotPresent
    name: init
    resources: {}
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  nodeName: *
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: log-volume
  - configMap:
      defaultMode: 420
      name: fluentbit-config
    name: fluentbit-config
  - emptyDir:
      medium: Memory
      sizeLimit: 160Gi
    name: shared-mem
  - name: *
    secret:
      defaultMode: *
      secretName: *
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:20Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:31Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:31Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:18Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://83ed1acf040dfdba3007e542eb389e7a6ff7ee528267103739438216710960e9
    image: docker.io/fluent/fluent-bit:1.9.6
    imageID: cr.fluentbit.io/fluent/fluent-bit@sha256:dff47966c7c5f91fdfe44e94938db092902e0670302c853bd463d8e277756751
    lastState: {}
    name: fluentbit
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-06T13:00:20Z"
  - containerID: containerd://0960ad8f3ef8d6ee13d14e387e6d5224a61700e57be7254dcfeeed2bf0136af8
    image: docker.io/rayproject/ray:2.3.0-py39-cu113
    imageID: docker.io/rayproject/ray@sha256:6106b2a1b5a6276420be45df7bb5c913994735ad27a2e12d6348f151c552f780
    lastState: {}
    name: ray-worker
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-06T13:00:20Z"
  hostIP: *
  initContainerStatuses:
  - containerID: containerd://235baca680704e04578018d78c3cf2e88fa9dd37f42de5e209af96e9acd22518
    image: docker.io/library/busybox:1.28
    imageID: docker.io/library/busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
    lastState: {}
    name: init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://235baca680704e04578018d78c3cf2e88fa9dd37f42de5e209af96e9acd22518
        exitCode: 0
        finishedAt: "2023-04-06T13:00:19Z"
        reason: Completed
        startedAt: "2023-04-06T13:00:19Z"
  phase: Running
  podIP: *
  podIPs:
  - ip: *
  qosClass: Burstable
  startTime: "2023-04-06T13:00:18Z"

As a temporary workaround, it is possible to set environment variables on the worker container, for example:

containers:
  - name: ray-worker
    env:
    - name: RAY_automatic_object_spilling_enabled
      value: "false"
    - name: RAY_object_spilling_threshold
      value: "0.99"

This behavior (reading RAY_ + <config option name> from the environment) is defined for the C++ Ray processes here: ray/ray_config.h at a89c8d3330c64ed149c31c3877740ed9c534cbb6 · ray-project/ray · GitHub

The full list of config options can be found here: ray/ray_config_def.h at a89c8d3330c64ed149c31c3877740ed9c534cbb6 · ray-project/ray · GitHub
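
To double-check that the override actually reaches the worker container, the environment can be inspected in the running pod (the pod name here is a placeholder):

kubectl exec -n kuberay <worker-pod-name> -c ray-worker -- \
  env | grep -E 'RAY_(automatic_object_spilling_enabled|object_spilling_threshold)'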

Test script:

import numpy as np
import ray

ray.init()  # connect to the cluster


@ray.remote
def init_test_array():
    # 48000 x 48000 float64 zeros, roughly 17.5 GiB per array
    return np.zeros((48000, 48000))


@ray.remote
class TestObj:
    def __init__(self):
        # Four arrays pinned by the actor -> about 70,000 MiB in the object
        # store (matching the ray memory output below)
        self.obj_1 = ray.get(init_test_array.remote())
        self.obj_2 = ray.get(init_test_array.remote())
        self.obj_3 = ray.get(init_test_array.remote())
        self.obj_4 = ray.get(init_test_array.remote())

    def test(self):
        print(len(self.obj_1))
        print(len(self.obj_2))
        print(len(self.obj_3))
        print(len(self.obj_4))


actor_handle = TestObj.remote()

Check:

ray memory --units MB

--- Aggregate object store stats across all nodes ---
Plasma memory usage 70312 MiB, 5 objects, 95.21% full, 0.0% needed
Objects consumed by Ray tasks: 70312 MiB.
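
No spilling kicked in even at ~95% usage. As an extra sanity check, the raylet log on the worker can be grepped for spill-related messages (pod name is a placeholder; the log path assumes the default /tmp/ray layout):

kubectl exec -n kuberay <worker-pod-name> -c ray-worker -- \
  sh -c 'grep -i "Spilled" /tmp/ray/session_latest/logs/raylet.out || echo "no spilling"'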