Scaling down nodes with specific custom resources

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

ray.autoscaler.sdk.request_resources supports autoscaling on custom resources through bundles (Programmatic Cluster Scaling, Ray 2.3.0), but scaling down only the nodes created by a particular request_resources call is not supported. To scale the cluster down we have to call it with num_cpus=0, which takes away all nodes. Is there a way to scale down only the nodes spawned through the bundles argument?
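For reference, here is a minimal sketch of the pattern described above. The custom resource name "use_case_a" and the helper names are hypothetical; the resource name would have to match one declared on a node type in the cluster config. As far as we understand, the final call clears the request for the whole cluster rather than for one use case only.

```python
from typing import Dict, List

# Hypothetical bundle: 4 CPUs plus one unit of a custom resource.
# "use_case_a" is an assumed name, not something defined by Ray.
USE_CASE_A_BUNDLE: Dict[str, float] = {"CPU": 4, "use_case_a": 1}


def use_case_a_bundles() -> List[Dict[str, float]]:
    """Return the bundles requested for this use case (two copies here)."""
    return [dict(USE_CASE_A_BUNDLE), dict(USE_CASE_A_BUNDLE)]


def run_use_case_a() -> None:
    """Scale up for the use case, then release the request afterwards."""
    # Imported lazily so the helpers above can be used without a cluster.
    import ray
    from ray.autoscaler.sdk import request_resources

    ray.init(address="auto")

    # Ask the autoscaler for capacity that fits these bundles.
    request_resources(bundles=use_case_a_bundles())

    # ... run the workload for this use case here ...

    # The only scale-down we have found: this drops the request as a whole,
    # so it is not limited to the nodes that served use_case_a.
    request_resources(num_cpus=0)
```

What we are looking for is an equivalent of the last call that targets only the bundles requested for one use case.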

This matters to us because we want to segregate different use cases onto different kinds of nodes. The workload of one use case should not spill over to nodes meant for other use cases, and once a use case is done, we would like to tear down its nodes. This is very important for building the third generation of ML architectures, as described in The Third Generation of Production ML Architectures | Anyscale.

Also, could there be a way to specify that a bundle is repeated x times, instead of manually putting copies of the bundle in a list?
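Until something like that exists, the repetition can be done on the caller's side in plain Python. The sketch below uses a hypothetical helper, repeat_bundle, which is not part of Ray; it only builds the list that would then be passed as the bundles argument.

```python
import copy
from typing import Dict, List


def repeat_bundle(bundle: Dict[str, float], count: int) -> List[Dict[str, float]]:
    """Return `count` independent copies of `bundle` (hypothetical helper)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Copy each entry so later edits to one bundle do not affect the others.
    return [copy.deepcopy(bundle) for _ in range(count)]


# Example: ten identical bundles, each with a custom resource named "use_case_a"
# (an assumed name for illustration).
bundles = repeat_bundle({"CPU": 2, "use_case_a": 1}, 10)
```

The result would be passed as request_resources(bundles=bundles). The shorter form [bundle] * 10 also produces a list of the right length, but every entry is then the same dict object, which only matters if the bundles are modified afterwards.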