Is there a way of getting the biggest machine available for auto scaling?
Imagine that I am writing a generic script that can be used by anyone with a Ray K8s cluster.
If I send out a script, I don't know in advance what environment it will run in, so I cannot hard-code @ray.remote(num_cpus=24),
because if the biggest machine available for scaling in that environment has only 23 CPUs, that task will hang.
I went through the docs and the SDKs but couldn't find anything other than resource requests.
Basically, is there a way of doing something like:
ray.get_available_machines()
(even for machines that haven't been scaled up yet) and having it return, e.g., a list like:
[
machine1: {num_cpus = 32, resource = {customresource1: 1} },
machine2: {num_cpus = 16, resource = {customresource2: 1} }
]
I'd expect the autoscaler SDK (ray.autoscaler.sdk)
to have something like this, but it doesn't seem to.
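
For reference, the closest thing I'm aware of is ray.nodes(), which only reports nodes that are already running, so it does not answer the question for machine types the autoscaler hasn't launched yet. Below is a minimal sketch of that partial workaround. The helper names largest_live_node_cpus and submit_sized_task are hypothetical, and the sketch assumes each entry returned by ray.nodes() is a dict with "Alive" and "Resources" keys and that the script connects to an existing cluster.

```python
def largest_live_node_cpus(nodes):
    """Return the largest CPU count among alive nodes, or 0 if there are none.

    `nodes` is assumed to look like the output of ray.nodes(): a list of
    dicts with an "Alive" flag and a "Resources" mapping.
    """
    cpu_counts = [
        int(node.get("Resources", {}).get("CPU", 0))
        for node in nodes
        if node.get("Alive")
    ]
    return max(cpu_counts, default=0)


def submit_sized_task():
    """Hypothetical driver: size one task to the biggest node currently up."""
    import ray  # imported here so the helper above works without Ray installed

    ray.init(address="auto")  # assumes an already-running cluster
    biggest = largest_live_node_cpus(ray.nodes())
    if biggest == 0:
        raise RuntimeError("no live node reports any CPU resource")

    @ray.remote
    def work():
        return "done"

    # Override num_cpus at call time instead of hard-coding it in the decorator.
    return ray.get(work.options(num_cpus=biggest).remote())
```

Calling submit_sized_task() from the driver would request as many CPUs as the largest live node has. It still cannot see larger node types that exist only in the autoscaler config, which is exactly the gap I'm asking about.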