Scaling down nodes with specific custom resources

vgill · March 23, 2023, 4:35am

How severely does this issue affect your experience of using Ray?

High: It blocks me from completing my task.

ray.autoscaler.sdk.request_resources supports autoscaling on custom resources through bundles (Programmatic Cluster Scaling — Ray 2.3.0), but scaling down only the nodes that were created for a particular request_resources call is not supported. To scale the cluster down, we have to call it with num_cpus=0, which takes away all the nodes. Is there a way to scale down only the nodes that were spawned through the bundles argument?
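For concreteness, the pattern described above looks roughly like the sketch below. The custom resource name use_case_a and the helper names are hypothetical and only for illustration; the node types in the cluster config are assumed to advertise that resource. request_resources(num_cpus=..., bundles=...) is the documented SDK call, and each call overrides the previous request. The two functions that call Ray are defined but not invoked here, since they need a running cluster.

```python
from typing import Dict, List

Bundle = Dict[str, float]


def bundles_for(resource_name: str, count: int, cpus: int) -> List[Bundle]:
    """Build `count` bundles that each ask for `cpus` CPUs plus one unit
    of a custom resource (hypothetical name chosen by the caller)."""
    return [{"CPU": cpus, resource_name: 1} for _ in range(count)]


def scale_up(bundles: List[Bundle]) -> None:
    """Ask the autoscaler to fit these bundles (needs a running Ray cluster)."""
    from ray.autoscaler.sdk import request_resources
    request_resources(bundles=bundles)


def scale_down_everything() -> None:
    """The reset described in the post: it clears the whole standing
    request, not just the part belonging to one use case."""
    from ray.autoscaler.sdk import request_resources
    request_resources(num_cpus=0)


# Demand for one use case; "use_case_a" is an assumed custom resource name.
demand = bundles_for("use_case_a", count=2, cpus=4)
```

What is missing is a counterpart to scale_up that releases only the nodes brought up for demand.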

This is very important to us because we want to segregate different use cases onto different kinds of nodes. The load from one use case should not spill over to nodes meant for other use cases, and once a use case is done, we would like to tear down its nodes. This matters a lot for building the third generation of ML architectures, as described in The Third Generation of Production ML Architectures | Anyscale.

vgill · March 23, 2023, 4:36am

Also, could there be a provision to say that a bundle is repeated x times, instead of manually putting copies of the bundle in a list?
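Absent such an option, the repetition can be done on the caller's side. A minimal sketch follows; repeat_bundle is a hypothetical helper, not part of the Ray SDK, and the resource name use_case_a is again only an example.

```python
from typing import Dict, List


def repeat_bundle(bundle: Dict[str, float], count: int) -> List[Dict[str, float]]:
    """Return `count` independent copies of `bundle`."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Copy each time so later edits to one entry do not affect the others.
    return [dict(bundle) for _ in range(count)]


# Three CPU bundles tagged with a custom resource, plus two GPU bundles.
bundles = repeat_bundle({"CPU": 4, "use_case_a": 1}, 3) + repeat_bundle({"GPU": 1}, 2)
```

The resulting list can be passed as the bundles argument of request_resources.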
