Hi there!
I’m trying to apply system settings like '{"object_spilling_threshold": 0.99}' to the ray-worker nodes.
If I put the setting in the worker's rayStartParams section of the YAML, the following error is raised:
ValueError: System config parameters can only be set on the head node.
If the setting is only set on the head, for example:
rayStartParams:
  dashboard-agent-listen-port: "52365"
  metrics-export-port: "9001"
  ...
  system-config: "'{\"object_spilling_threshold\":0.99}'"
then the workers do not apply the system-config parameters.
The question is: how can I apply system-config parameters on Ray workers?
To be clear, I am trying to set up a cluster in k8s with 1 head node and multiple worker nodes.
The "Best practices for deploying large clusters" page (Ray 2.3.1 docs) recommends avoiding scheduling additional tasks on the head node.
My application needs to keep continuously updated model weights in memory; they are passed to Serve actors via shared memory. To this end, I allocate enough memory for the object store on the worker nodes. But the raylet starts spilling as soon as the object store crosses the 0.8 full threshold.
This is very inconvenient for serving my models, since I have to reserve an extra ~20% of memory that stays almost unused the whole time.
I hope I am just missing something in the docs.
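For comparison, the same override is easy to express on a single local node via ray.init, which is roughly the behavior I want on each worker (a minimal local sketch, not the KubeRay setup; it relies on the private _system_config argument of ray.init):

import ray

# Local-only sketch: raise the spilling threshold so the object store can
# fill up to ~99% before the raylet starts spilling objects to disk.
ray.init(
    object_store_memory=2 * 1024**3,  # 2 GiB object store for this node
    _system_config={"object_spilling_threshold": 0.99},
)
print(ray.cluster_resources())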
Configuration details:
Head node:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: *
    cni.projectcalico.org/podIPs: *
    ray.io/external-storage-namespace: *
    ray.io/ft-enabled: "true"
    ray.io/health-state: ""
  creationTimestamp: "2023-04-05T22:13:39Z"
  generateName: kuberay-cluster-head-
  labels:
    app.kubernetes.io/created-by: kuberay-operator
    app.kubernetes.io/instance: kuberay-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-0.4.0
    ray.io/cluster: kuberay-cluster
    ray.io/cluster-dashboard: kuberay-cluster-dashboard
    ray.io/group: headgroup
    ray.io/identifier: kuberay-cluster-head
    ray.io/is-ray-node: "yes"
    ray.io/node-type: head
  name: kuberay-cluster-head-zmvvx
  namespace: kuberay
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: kuberay-cluster
    uid: *
  resourceVersion: "275448686"
  uid: *
spec:
  affinity: {}
  containers:
  - args:
    - 'ulimit -n 65536; ray start --head --system-config=''{"object_spilling_threshold":0.99}'' --block --memory=8000000000 --dashboard-agent-listen-port=52365 --dashboard-host=0.0.0.0 --metrics-export-port=9001 --num-cpus=0 '
    command:
    - /bin/bash
    - -lc
    - --
    env:
    - name: RAY_ROTATION_BACKUP_COUNT
      value: "1"
    - name: RAY_ROTATION_MAX_BYTES
      value: "134217728"
    - name: RAY_GRAFANA_HOST
      value: 8
    - name: RAY_REDIS_ADDRESS
      value: *
    - name: RAY_IP
      value: 127.0.0.1
    - name: RAY_PORT
      value: "6379"
    - name: RAY_ADDRESS
      value: 127.0.0.1:6379
    - name: RAY_USAGE_STATS_KUBERAY_IN_USE
      value: "1"
    - name: REDIS_PASSWORD
    - name: RAY_external_storage_namespace
      value: *
    image: rayproject/ray:2.3.0-py39-cu113
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - ray stop
    livenessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep success
        - '&&'
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:8265/api/gcs_healthz | grep success
      failureThreshold: 40
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: ray-head
    ports:
    - containerPort: 10001
      name: client
      protocol: TCP
    - containerPort: 8265
      name: dashboard
      protocol: TCP
    - containerPort: 8000
      name: ray-serve
      protocol: TCP
    - containerPort: 52365
      name: dashboard-agent
      protocol: TCP
    - containerPort: 6379
      name: redis
      protocol: TCP
    - containerPort: 9001
      name: http-metrics
      protocol: TCP
    - containerPort: 8080
      name: metrics
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep success
        - '&&'
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:8265/api/gcs_healthz | grep success
      failureThreshold: 20
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "4"
        ephemeral-storage: 1G
        memory: 8G
      requests:
        cpu: "4"
        ephemeral-storage: 1G
        memory: 8G
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
    - mountPath: /fluent-bit/etc/fluent-bit.conf
      name: fluentbit-config
      subPath: fluent-bit.conf
    - mountPath: /dev/shm
      name: shared-mem
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  - image: fluent/fluent-bit:1.9.6
    imagePullPolicy: IfNotPresent
    name: fluentbit
    resources:
      limits:
        cpu: 100m
        ephemeral-storage: 1G
        memory: 128Mi
      requests:
        cpu: 100m
        ephemeral-storage: 1G
        memory: 128Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: *
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: log-volume
  - configMap:
      defaultMode: 420
      name: fluentbit-config
    name: fluentbit-config
  - emptyDir:
      medium: Memory
      sizeLimit: 8G
    name: shared-mem
  - name: *
    secret:
      defaultMode: *
      secretName: *
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:39Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:51Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:51Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-04-05T22:13:39Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://974a1d165914055879956729bbc490c34797c9fc6a143e68b66d669be680156c
    image: docker.io/fluent/fluent-bit:1.9.6
    imageID: docker.io/fluent/fluent-bit@sha256:dff47966c7c5f91fdfe44e94938db092902e0670302c853bd463d8e277756751
    lastState: {}
    name: fluentbit
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-05T22:13:40Z"
  - containerID: containerd://82956fb6e79e8af210726bce13a98eaaf9e6d3069c2ce1988650a32223d108a2
    image: docker.io/rayproject/ray:2.3.0-py39-cu113
    imageID: docker.io/rayproject/ray@sha256:c5d12e42896e384f508f187b0280e93871fba03465efc3289f1a08fc401671b2
    lastState: {}
    name: ray-head
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-05T22:13:40Z"
  hostIP: *
  phase: Running
  podIP: *
  podIPs:
  - ip: *
  qosClass: Guaranteed
  startTime: "2023-04-05T22:13:39Z"
Worker node:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: *
    cni.projectcalico.org/podIPs: *
    key: value
    ray.io/external-storage-namespace: *
    ray.io/ft-enabled: "true"
    ray.io/health-state: ""
  creationTimestamp: "2023-04-06T13:00:18Z"
  generateName: kuberay-cluster-worker-workergroup-
  labels:
    app.kubernetes.io/created-by: kuberay-operator
    app.kubernetes.io/instance: kuberay-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kuberay
    helm.sh/chart: ray-cluster-0.4.0
    ray.io/cluster: kuberay-cluster
    ray.io/cluster-dashboard: kuberay-cluster-dashboard
    ray.io/group: workergroup
    ray.io/identifier: kuberay-cluster-worker
    ray.io/is-ray-node: "yes"
    ray.io/node-type: worker
  name: kuberay-cluster-worker-workergroup-*
  namespace: kuberay
  ownerReferences:
  - apiVersion: ray.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: RayCluster
    name: kuberay-cluster
    uid: *
  resourceVersion: "276005140"
  uid: *
spec:
  affinity: {}
  containers:
  - args:
    - 'ulimit -n 65536; ray start --dashboard-agent-listen-port=52365 --metrics-export-port=9001 --address=kuberay-cluster-head-svc:6379 --num-cpus=32 --memory=171798691840 --object-store-memory=75161927680 --block '
    command:
    - /bin/bash
    - -lc
    - --
    env:
    - name: RAY_ROTATION_BACKUP_COUNT
      value: "1"
    - name: RAY_ROTATION_MAX_BYTES
      value: "134217728"
    - name: RAY_GRAFANA_HOST
      value: *
    - name: RAY_IP
      value: kuberay-cluster-head-svc
    - name: RAY_PORT
      value: "6379"
    - name: RAY_ADDRESS
      value: kuberay-cluster-head-svc:6379
    - name: RAY_USAGE_STATS_KUBERAY_IN_USE
      value: "1"
    - name: REDIS_PASSWORD
    - name: *
      value: *
    image: rayproject/ray:2.3.0-py39-cu113
    imagePullPolicy: IfNotPresent
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - ray stop
    livenessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep success
      failureThreshold: 40
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: ray-worker
    ports:
    - containerPort: 10001
      name: client
      protocol: TCP
    - containerPort: 8265
      name: dashboard
      protocol: TCP
    - containerPort: 8000
      name: ray-serve
      protocol: TCP
    - containerPort: 52365
      name: dashboard-agent
      protocol: TCP
    - containerPort: 6379
      name: redis
      protocol: TCP
    - containerPort: 9001
      name: http-metrics
      protocol: TCP
    - containerPort: 8080
      name: metrics
      protocol: TCP
    readinessProbe:
      exec:
        command:
        - bash
        - -c
        - wget -T 2 -q -O- http://localhost:52365/api/local_raylet_healthz | grep success
      failureThreshold: 20
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "32"
        ephemeral-storage: 500M
        memory: 160Gi
      requests:
        cpu: "32"
        ephemeral-storage: 500M
        memory: 160Gi
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp/ray
      name: log-volume
    - mountPath: /fluent-bit/etc/fluent-bit.conf
      name: fluentbit-config
      subPath: fluent-bit.conf
    - mountPath: /dev/shm
      name: shared-mem
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  - image: fluent/fluent-bit:1.9.6
    imagePullPolicy: IfNotPresent
    name: fluentbit
    resources:
      limits:
        cpu: 100m
        ephemeral-storage: 500M
        memory: 128Mi
      requests:
        cpu: 100m
        ephemeral-storage: 500M
        memory: 128Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - sh
    - -c
    - until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for K8s Service $RAY_IP; sleep 2; done
    env:
    - name: RAY_IP
      value: kuberay-cluster-head-svc
    image: busybox:1.28
    imagePullPolicy: IfNotPresent
    name: init
    resources: {}
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: *
      readOnly: true
  nodeName: *
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: log-volume
  - configMap:
      defaultMode: 420
      name: fluentbit-config
    name: fluentbit-config
  - emptyDir:
      medium: Memory
      sizeLimit: 160Gi
    name: shared-mem
  - name: *
    secret:
      defaultMode: *
      secretName: *
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:20Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:31Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:31Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-04-06T13:00:18Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://83ed1acf040dfdba3007e542eb389e7a6ff7ee528267103739438216710960e9
    image: docker.io/fluent/fluent-bit:1.9.6
    imageID: cr.fluentbit.io/fluent/fluent-bit@sha256:dff47966c7c5f91fdfe44e94938db092902e0670302c853bd463d8e277756751
    lastState: {}
    name: fluentbit
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-06T13:00:20Z"
  - containerID: containerd://0960ad8f3ef8d6ee13d14e387e6d5224a61700e57be7254dcfeeed2bf0136af8
    image: docker.io/rayproject/ray:2.3.0-py39-cu113
    imageID: docker.io/rayproject/ray@sha256:6106b2a1b5a6276420be45df7bb5c913994735ad27a2e12d6348f151c552f780
    lastState: {}
    name: ray-worker
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-04-06T13:00:20Z"
  hostIP: *
  initContainerStatuses:
  - containerID: containerd://235baca680704e04578018d78c3cf2e88fa9dd37f42de5e209af96e9acd22518
    image: docker.io/library/busybox:1.28
    imageID: docker.io/library/busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
    lastState: {}
    name: init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://235baca680704e04578018d78c3cf2e88fa9dd37f42de5e209af96e9acd22518
        exitCode: 0
        finishedAt: "2023-04-06T13:00:19Z"
        reason: Completed
        startedAt: "2023-04-06T13:00:19Z"
  phase: Running
  podIP: *
  podIPs:
  - ip: *
  qosClass: Burstable
  startTime: "2023-04-06T13:00:18Z"
As a temporary solution, it is possible to set environment variables on the worker container like this:
containers:
- name: ray-worker
  env:
  - name: RAY_automatic_object_spilling_enabled
    value: "false"
  - name: RAY_object_spilling_threshold
    value: "0.99"
This behavior (reading RAY_<config option> environment variables) is defined for the C++ Ray processes here: ray/ray_config.h at a89c8d3330c64ed149c31c3877740ed9c534cbb6 · ray-project/ray · GitHub
The full list of config options can be found here: ray/ray_config_def.h at a89c8d3330c64ed149c31c3877740ed9c534cbb6 · ray-project/ray · GitHub
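A rough way to sanity-check from a driver that the override reached the worker pods (a sketch under my assumptions: tasks land on the workers because the head runs with --num-cpus=0, and Ray worker processes inherit the container's environment; it only confirms the variable is set, not that the raylet applied it):

import os

import ray

ray.init(address="auto")

@ray.remote
def spilling_threshold_env():
    # Worker processes inherit the ray-worker container's environment,
    # so the override should be visible here.
    return os.environ.get("RAY_object_spilling_threshold")

print(ray.get(spilling_threshold_env.remote()))  # expect "0.99"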
Test script:
import numpy as np
import ray

@ray.remote
def init_test_array():
    # Each array is 48000 * 48000 float64 values, ~17.2 GiB in the object store.
    return np.zeros((48000, 48000))

@ray.remote
class TestObj:
    def __init__(self):
        self.obj_1 = ray.get(init_test_array.remote())
        self.obj_2 = ray.get(init_test_array.remote())
        self.obj_3 = ray.get(init_test_array.remote())
        self.obj_4 = ray.get(init_test_array.remote())

    def test(self):
        print(len(self.obj_1))
        print(len(self.obj_2))
        print(len(self.obj_3))
        print(len(self.obj_4))

actor_handle = TestObj.remote()
Check:
ray memory --units MB
--- Aggregate object store stats across all nodes ---
Plasma memory usage 70312 MiB, 5 objects, 95.21% full, 0.0% needed
Objects consumed by Ray tasks: 70312 MiB.
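The reported usage matches the array sizes (a quick back-of-the-envelope check, assuming float64 arrays):

# Each array holds 48000 * 48000 float64 values of 8 bytes each.
bytes_per_array = 48000 * 48000 * 8      # 18,432,000,000 bytes
mib_per_array = bytes_per_array / 2**20  # ~17578 MiB
print(4 * mib_per_array)                 # ~70312 MiB, as reported above

# The worker's object store is 75161927680 bytes (= 71680 MiB), so the
# default 0.8 spilling threshold is crossed before all four arrays are resident.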