Hello,
I started minikube with 24 CPUs and 24 GB of RAM. The cluster was started with 2 workers (excluding the head node), with a maximum of 12 workers set in example-full.yaml. I ran 1500 tasks on the Ray head node and I can see that Ray scaled up the nodes:
ray.init(address="auto")
2021-04-23 07:21:56,873 INFO worker.py:655 -- Connecting to existing Ray cluster at address: 172.17.0.6:6379
{'node_ip_address': '172.17.0.6', 'raylet_ip_address': '172.17.0.6', 'redis_address': '172.17.0.6:6379', 'object_store_address': '/tmp/ray/session_2021-04-23_07-16-41_226731_155/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2021-04-23_07-16-41_226731_155/sockets/raylet', 'webui_url': '0.0.0.0:8265', 'session_dir': '/tmp/ray/session_2021-04-23_07-16-41_226731_155', 'metrics_export_port': 52943, 'node_id': '40a5e6554a7026e0d029ad05ba02b430d9b8070d6f719ab850f5e58d'}
res = ray.get([f.remote() for _ in range(1500)])
(autoscaler +3m2s) Tip: use `ray status` to view detailed autoscaling status. To disable autoscaler event messages, you can set AUTOSCALER_EVENTS=0.
(autoscaler +3m2s) Adding 5 nodes of type worker_node.
I then exited from the head node, and as I understand it, all the tasks will be purged. I was hoping the workers would be scaled down too, but I don't see that happening:
NAME                               READY   STATUS    RESTARTS   AGE
example-cluster-ray-head-gfvt6     1/1     Running   0          36m
example-cluster-ray-worker-4wcds   1/1     Running   0          29m
example-cluster-ray-worker-8kbcb   1/1     Running   0          29m
example-cluster-ray-worker-hkzkf   1/1     Running   0          29m
example-cluster-ray-worker-tkxm9   1/1     Running   0          29m
example-cluster-ray-worker-tsmbd   1/1     Running   0          29m
example-cluster-ray-worker-v6ffd   1/1     Running   0          35m
example-cluster-ray-worker-vppcr   1/1     Running   0          35m
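For reference, my understanding is that the autoscaler only removes a worker after it has been idle for `idle_timeout_minutes`, which is set in the cluster config alongside `max_workers`. A sketch of the relevant section of example-full.yaml, assuming the Ray defaults (the values below are illustrative, not necessarily what my file contains):

```yaml
# Maximum number of worker nodes to launch (excluding the head node).
max_workers: 12

# If a worker node has been idle for this many minutes, the autoscaler
# removes it. (Assumption: 5 is the default; my config may differ.)
idle_timeout_minutes: 5
```

So my expectation was that roughly 5 minutes after the tasks finished, the extra workers would be terminated.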