Hello,
I started minikube with 24 CPUs and 24 GB of RAM. The cluster was started with 2 workers (excluding the head node), with a maximum of 12 workers set in example-full.yaml. I ran 1500 tasks on the Ray head node and I can see that Ray scaled up the nodes:
ray.init(address="auto")
2021-04-23 07:21:56,873 INFO worker.py:655 -- Connecting to existing Ray cluster at address: 172.17.0.6:6379
{'node_ip_address': '172.17.0.6', 'raylet_ip_address': '172.17.0.6', 'redis_address': '172.17.0.6:6379', 'object_store_address': '/tmp/ray/session_2021-04-23_07-16-41_226731_155/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2021-04-23_07-16-41_226731_155/sockets/raylet', 'webui_url': '0.0.0.0:8265', 'session_dir': '/tmp/ray/session_2021-04-23_07-16-41_226731_155', 'metrics_export_port': 52943, 'node_id': '40a5e6554a7026e0d029ad05ba02b430d9b8070d6f719ab850f5e58d'}
res = ray.get([f.remote() for _ in range(1500)])
(autoscaler +3m2s) Tip: use `ray status` to view detailed autoscaling status. To disable autoscaler event messages, you can set AUTOSCALER_EVENTS=0.
(autoscaler +3m2s) Adding 5 nodes of type worker_node.
I then exited from the head node, and as I understand it, all the tasks will be purged. I was hoping the workers would be scaled down too, but I don't see that happening:
NAME                               READY   STATUS    RESTARTS   AGE
example-cluster-ray-head-gfvt6     1/1     Running   0          36m
example-cluster-ray-worker-4wcds   1/1     Running   0          29m
example-cluster-ray-worker-8kbcb   1/1     Running   0          29m
example-cluster-ray-worker-hkzkf   1/1     Running   0          29m
example-cluster-ray-worker-tkxm9   1/1     Running   0          29m
example-cluster-ray-worker-tsmbd   1/1     Running   0          29m
example-cluster-ray-worker-v6ffd   1/1     Running   0          35m
example-cluster-ray-worker-vppcr   1/1     Running   0          35m
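For reference, my understanding is that the autoscaler only removes a worker after it has been idle for `idle_timeout_minutes`, which is set in the cluster config alongside `max_workers`. A sketch of the relevant section of example-full.yaml, assuming the Ray defaults (the values below are illustrative, not necessarily what my file contains):

```yaml
# Maximum number of worker nodes to launch (excluding the head node).
max_workers: 12

# If a worker node has been idle for this many minutes, the autoscaler
# removes it. (Assumption: 5 is the default; my config may differ.)
idle_timeout_minutes: 5
```

So my expectation was that roughly 5 minutes after the tasks finished, the extra workers would be terminated.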