Autoscaling behavior

I have a quick question about the autoscaling behavior described here: A Glimpse into the Ray Autoscaler by Ameer Haj Ali - YouTube

There you say that the autoscaler calculates exactly how many resources are needed to run a particular function. If I set up a Ray cluster with 0 workers and run only a function like the one you describe there, should I expect to see instantaneous provisioning of just enough machines to provide 6 CPUs?

Regarding the time it takes to bring up the machines: I guess this is gated only by how fast the underlying IaaS can bring up VMs, right?

Besides that, which role (if any) does the ratio parameter play here? Can it limit the number of VMs getting provisioned concurrently? E.g., if I have a loop of 1000 iterations, will the autoscaler request 1000 VMs right away as soon as control flow enters the loop?


Yes, but it takes time to bring up those machines. Also, if the head node can fit the 6 CPUs, the autoscaler will not start any worker nodes.
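
To make the "just enough machines" idea concrete, here is a rough sketch of the kind of packing involved. It is not the autoscaler's actual code: the function name `workers_needed` and its parameters are made up for illustration, and it assumes identical worker nodes and CPU-only demands, placed with a simple first-fit rule.

```python
def workers_needed(task_cpus, head_free_cpus, worker_cpus):
    """Estimate how many new workers are needed (hypothetical helper).

    task_cpus: CPU demand of each pending task.
    head_free_cpus: CPUs still free on the head node.
    worker_cpus: CPUs per worker node (all workers assumed identical).
    """
    free = [head_free_cpus]  # free[0] is the head node; later entries are new workers
    for demand in sorted(task_cpus, reverse=True):
        for i, cpus in enumerate(free):
            if cpus >= demand:
                free[i] -= demand  # the task fits on a node we already have
                break
        else:
            if demand > worker_cpus:
                raise ValueError("task does not fit on any node type")
            free.append(worker_cpus - demand)  # start one more worker for it
    return len(free) - 1  # do not count the head node


# Six 1-CPU tasks, head node with 8 free CPUs: no workers are started.
print(workers_needed([1] * 6, head_free_cpus=8, worker_cpus=2))
# Same tasks, head node with no free CPUs, 2-CPU workers: three workers.
print(workers_needed([1] * 6, head_free_cpus=0, worker_cpus=2))
```

The first call is the case mentioned above: the head node absorbs the whole demand, so nothing is provisioned.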

Keep in mind that scale-up is exponential in the number of running nodes: if 10 nodes are running, it will add another 10, then another 20, then another 40, and so on.
If you want this to go faster, set upscaling_speed: 99999 in the cluster YAML. The autoscaler will then request all the nodes it needs at once, although each machine still takes time to boot.
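
A small simulation of that growth pattern, in case it helps. This is only a sketch and not the real autoscaler logic: `upscaling_schedule` is a made-up name, it assumes every launched node comes up before the next round, and the `min_launch` floor of 5 is my reading of the configuration docs, so treat it as an assumption.

```python
import math


def upscaling_schedule(running, target, upscaling_speed=1.0, min_launch=5):
    """Return how many nodes are launched in each round until `target` is reached.

    Hypothetical helper: each round may launch at most
    upscaling_speed * (running nodes), with an assumed floor of `min_launch`.
    """
    launches = []
    while running < target:
        allowed = max(min_launch, math.floor(upscaling_speed * running))
        batch = min(allowed, target - running)  # never overshoot the target
        launches.append(batch)
        running += batch  # assume the whole batch is up before the next round
    return launches


# Default speed: the cluster roughly doubles each round.
print(upscaling_schedule(10, 1000))
# → [10, 20, 40, 80, 160, 320, 360]

# Very large upscaling_speed: everything is requested in a single round.
print(upscaling_schedule(10, 1000, upscaling_speed=99999))
# → [990]
```

So for the loop of 1000, the default setting ramps up over several rounds, while a very large upscaling_speed asks for all the missing nodes in one go.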

more info on upscaling_speed: Cluster YAML Configuration Options — Ray v2.0.0.dev0