[Autoscaler] Does Ray still use autoscaler when starting the cluster manually?

Probably a silly question, or possibly answered before, but can the autoscaler be enabled with a manual cluster launch?

Yeah, you can enable the autoscaler during a manual cluster launch.

Ray just runs:

ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml

on the head node of the cluster.

Hey Richard, sorry to bug you on an old post, but I’ve come across an error when I try to enable the autoscaler in a manual launch.
I don’t get any errors right away, but when my program starts to run I get this:

2021-09-24 15:11:56,168 WARNING worker.py:1059 -- The autoscaler failed with the following error:
Traceback (most recent call last):
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/monitor.py", line 284, in run
    self._initialize_autoscaler()
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/monitor.py", line 129, in _initialize_autoscaler
    self.autoscaler = StandardAutoscaler(
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 86, in __init__
    self.reset(errors_fatal=True)
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 537, in reset
    raise e
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 467, in reset
    with open(self.config_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/root/ray_bootstrap_config.yaml'

It seems to be looking for the bootstrap file. I tried looking for one but couldn’t find it. Can you let me know if there’s a fix, what the bootstrap file is, and where it can be located?

It should be the same autoscaling config YAML that you would otherwise use to launch the cluster. ray_bootstrap_config.yaml is just that file, placed on the head node at the path you pass to --autoscaling-config!
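
To rule out a missing file before running ray start, a small check like the one below might help. This is only a sketch: resolve_autoscaling_config is a made-up helper, not part of Ray, and all it does is expand ~ and confirm that a file exists at the path you intend to pass to --autoscaling-config.

```python
import os


def resolve_autoscaling_config(path):
    """Expand ~ in the path and confirm the config file exists.

    Hypothetical helper for illustration; it is not part of Ray.
    """
    expanded = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(expanded):
        raise FileNotFoundError(
            f"Autoscaling config not found at {expanded}; "
            "copy your cluster YAML there or pass its real path."
        )
    return expanded
```

Calling resolve_autoscaling_config("~/ray_bootstrap_config.yaml") on the head node would raise right away if the file is absent, instead of the autoscaler failing later once the program is running.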

Ohhh, so if I understand correctly, I would use a YAML like this if running on an on-premise cluster?

@rliaw I have a question about manually starting a cluster too. Just asking here for convenience.

In the ray_bootstrap_config.yaml file, can I set head_node_type to a type that isn’t in available_node_types? The reason is that I want all my tasks to run remotely, and the autoscaler should never start a node of the head type. It seems to work fine that way, but when I call ray.autoscaler.sdk.request_resources, the code seems to check the resource requirements of the head node in the available_node_types dict. Is this expected? It then tries to launch nodes of that type, although my tasks didn’t have resource requirements that could be satisfied by that type.

This is a question for @Dmitri!

Hi, @lshao-ts !

head_node_type has to be one of the available_node_types.
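
A quick way to check that rule against a config that has already been loaded into a dict is sketched below. check_head_node_type and the node type names head_node and worker_node are hypothetical, chosen only for illustration; this is not Ray's own validation code.

```python
def check_head_node_type(config):
    """Return the head node type name, or raise if it is not declared.

    Hypothetical helper; `config` is the autoscaling YAML loaded as a dict.
    """
    head_type = config.get("head_node_type")
    node_types = config.get("available_node_types", {})
    if head_type not in node_types:
        raise ValueError(
            f"head_node_type {head_type!r} is not one of "
            f"available_node_types {sorted(node_types)}"
        )
    return head_type


# Example config with made-up node type names.
example_config = {
    "head_node_type": "head_node",
    "available_node_types": {
        "head_node": {"resources": {"CPU": 4}},
        "worker_node": {"resources": {"CPU": 16}},
    },
}
check_head_node_type(example_config)  # passes: head_node is declared
```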

It sounds like you might have run into a bug. Could you post a bug report on GitHub with details like the autoscaling config, Ray version, how you’re starting Ray, etc.?

The bug is probably on my end as I’m still trying to figure out many details.

If head_node_type is in the available_node_types but not to be scheduled, do I fill out its resources to match what I specified when running ray start --head, such as CPU, GPU, memory?

Probably that’s a good idea.
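
One way to keep the two in sync is to derive the ray start flags from the same resources dict that goes into the config. The sketch below assumes the head type is called head_node and uses example values of 8 CPUs and 1 GPU; head_start_flags is a hypothetical helper, and memory and custom resources are left out.

```python
def head_start_flags(resources):
    """Build `ray start --head` arguments matching a resources dict.

    Hypothetical helper; only CPU and GPU are handled here.
    """
    flags = ["ray", "start", "--head"]
    if "CPU" in resources:
        flags.append(f"--num-cpus={resources['CPU']}")
    if "GPU" in resources:
        flags.append(f"--num-gpus={resources['GPU']}")
    return flags


# Example values; use whatever the head machine really has.
head_resources = {"CPU": 8, "GPU": 1}

# Fragment of the autoscaling config, written as a dict for illustration.
config_fragment = {
    "head_node_type": "head_node",  # made-up type name
    "available_node_types": {
        "head_node": {"resources": head_resources},
    },
}

print(" ".join(head_start_flags(head_resources)))
# prints: ray start --head --num-cpus=8 --num-gpus=1
```

You would still append --autoscaling-config and any port options to that command yourself.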

Could you share that autoscaling config you’re using?

Also, I should say that manually starting an autoscaling cluster with an autoscaling config is possible, but isn’t as well supported or documented as using ray up.