To use as many resources as possible, I tried the following combinations. I didn't run any code yet, but the network transmit (sent) traffic on the Ray head node is already very large.
(1) each worker pod has 4 cpus (1119 cpus in total)
The head node contains a process called gcs_server, which interacts heavily with the rest of the system. One example is heartbeat data: the head node receives heartbeats from all nodes and sends the combined result back to every other node, which amounts to an O(N^2) payload (it receives heartbeats from N nodes, then sends the combined batch of N heartbeats back to each of the N nodes).
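To make that quadratic growth concrete, here is a rough back-of-the-envelope sketch. The per-node heartbeat size and broadcast interval below are assumed values for illustration, not measured Ray numbers:

```python
# Rough sketch of the head node's heartbeat egress as the cluster grows.
# HEARTBEAT_BYTES_PER_NODE and BROADCAST_INTERVAL_S are assumptions, not
# measured Ray values.

HEARTBEAT_BYTES_PER_NODE = 300   # assumed size of one node's heartbeat entry
BROADCAST_INTERVAL_S = 0.1       # assumed batching/broadcast period

def head_egress_bytes_per_sec(num_nodes: int) -> float:
    """O(N^2): the head combines N heartbeats into one batch and then
    sends that batch back to each of the N nodes."""
    batch_bytes = num_nodes * HEARTBEAT_BYTES_PER_NODE
    return num_nodes * batch_bytes / BROADCAST_INTERVAL_S

for n in (100, 280, 1000):
    print(f"{n:>5} nodes -> ~{head_egress_bytes_per_sec(n) / 1e6:,.0f} MB/s")
```

Under these assumed numbers, ~280 small nodes (roughly the 4-cpu-pod setup above) already produces hundreds of MB/s of heartbeat egress, and 1000 nodes would be several GB/s, which is why fewer, bigger nodes scale better.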
Hey @crystalww, the size of heartbeats (and thus the network/cpu usage on the head node) grows roughly quadratically w.r.t. the number of nodes in the cluster. It looks like you’ve already discovered this, but the cluster will scale better if you use bigger nodes (people tend to use 32/64+ cpus per node on big clusters).
If you’re running big clusters, it may also help to give your head node a few extra cpus (or just set num_cpus=0 on the head node so no tasks get scheduled there). The head node runs extra processes like the GCS, so letting it focus its resources on control-plane operations tends to help.
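As a minimal sketch of the num_cpus=0 suggestion: in a real multi-node deployment you would usually pass --num-cpus=0 to the head node's `ray start --head` command (or the equivalent field in your cluster config); the Python snippet below just illustrates the effect on scheduling.

```python
import ray

# Start the head with zero advertised CPUs so Ray won't schedule CPU tasks on it.
# (In a multi-node cluster you'd do this via `ray start --head --num-cpus=0`.)
ray.init(num_cpus=0)

@ray.remote(num_cpus=1)
def work():
    return "ran on a worker node"

# This task stays pending until a worker node with CPUs joins the cluster,
# leaving the head node free for GCS / control-plane work.
ref = work.remote()
```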
I am considering scaling my cluster up to 1000 machines, each with 36 cpus, but I’m afraid my head node’s bandwidth will be saturated by the quadratically growing heartbeat packets, resulting in a congested network. Is there a good way to deal with this?