[Autoscaler] Does Ray still use autoscaler when starting the cluster manually?

michaelarman · May 21, 2021, 3:41pm

Probably a silly question or possibly answered before but can the autoscaler be enabled with a manual cluster launch?

rliaw · May 21, 2021, 9:47pm

Yeah, you can enable the autoscaler during a manual cluster launch.

Ray just runs:

ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml

on the head node of the cluster.

michaelarman · September 24, 2021, 8:16pm

Hey Richard, sorry to bug you on an old post but I’ve come across an error when I try to enable the autoscaler in a manual launch.
I don’t get any errors right away, but when my program starts to run I get this:

2021-09-24 15:11:56,168 WARNING worker.py:1059 -- The autoscaler failed with the following error:
Traceback (most recent call last):
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/monitor.py", line 284, in run
    self._initialize_autoscaler()
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/monitor.py", line 129, in _initialize_autoscaler
    self.autoscaler = StandardAutoscaler(
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 86, in __init__
    self.reset(errors_fatal=True)
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 537, in reset
    raise e
  File "/root/anaconda3/envs/LarusTF/lib/python3.8/site-packages/ray/autoscaler/_private/autoscaler.py", line 467, in reset
    with open(self.config_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/root/ray_bootstrap_config.yaml'

It seems to be looking for the bootstrap file. I tried looking for one but couldn’t find it. Can you let me know if there’s a fix, what the bootstrap file is, and where it can be located?

rliaw · September 24, 2021, 8:34pm

It should be the same autoscaling config (ray_bootstrap_config.yaml) that you use to launch or specify the autoscaling configuration!
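
For reference, the traceback above shows the autoscaler opening whatever path was passed to --autoscaling-config, with ~ expanded to /root. As far as I know, ray up copies the config to ~/ray_bootstrap_config.yaml on the head node for you; on a manual launch nothing creates that file, so it has to be put there by hand (or the flag pointed at wherever the config really lives). A minimal pre-flight sketch, assuming a hypothetical helper named resolve_autoscaling_config (not part of Ray), that checks the file exists before you run ray start:

```python
import os


def resolve_autoscaling_config(path):
    """Expand ~ and return the absolute path, or raise if the file is missing.

    Hypothetical helper for illustration only; this is not a Ray API.
    """
    expanded = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(expanded):
        raise FileNotFoundError(f"autoscaling config not found: {expanded}")
    return expanded
```

Calling resolve_autoscaling_config("~/ray_bootstrap_config.yaml") on the head node before starting Ray would surface the missing file immediately rather than later in the monitor process.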

michaelarman · September 24, 2021, 8:44pm

Ohhh, so if I understand correctly, I would use a yaml like this if running on an on-premise cluster?

lshao-ts · September 24, 2021, 9:57pm

@rliaw I have a question regarding manually starting cluster too. Just asking here for convenience.

In the ray_bootstrap_config.yaml file, can I set head_node_type to a type that isn’t in available_node_types? The reason is that I want all my tasks to run remotely, and the autoscaler should never start a node of the head type. It seems to work fine that way, but when I call ray.autoscaler.sdk.request_resources, the code seems to check the resource requirements of the head node in the available_node_types dict. Is this expected? It then tries to launch nodes of that type, even though my tasks didn’t have resource requirements that could be satisfied by that type.

rliaw · September 24, 2021, 10:22pm

This is a question for @Dmitri!

Dmitri · September 25, 2021, 2:47am

Hi, @lshao-ts !

head_node_type has to be one of the available_node_types.

It sounds like you might have run into a bug. Could you post a bug report on github with details like the autoscaling config, ray version, how you’re starting ray, etc?
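
In case it helps while putting the report together, the constraint above can be checked on the parsed config before starting Ray. This is only a sketch: check_head_node_type is a hypothetical helper, the node type names and resource numbers are made up, and the dict is a trimmed stand-in for a real config (which would normally be loaded from the YAML file and contain more fields).

```python
def check_head_node_type(config):
    """Return a list of problems with head_node_type in a parsed config dict.

    Hypothetical helper for illustration only; this is not a Ray API.
    """
    problems = []
    head = config.get("head_node_type")
    node_types = config.get("available_node_types", {})
    if head is None:
        problems.append("head_node_type is not set")
    elif head not in node_types:
        problems.append(
            f"head_node_type {head!r} is not in available_node_types: "
            f"{sorted(node_types)}"
        )
    return problems


# Trimmed example config; the names and numbers are assumptions.
config = {
    "head_node_type": "head",
    "available_node_types": {
        "head": {"resources": {"CPU": 8}, "min_workers": 0, "max_workers": 0},
        "worker": {"resources": {"CPU": 16}, "min_workers": 0, "max_workers": 4},
    },
}

for problem in check_head_node_type(config):
    print(problem)
```

An empty list means the head type is declared; anything else is worth fixing before involving the autoscaler.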

lshao-ts · September 25, 2021, 3:02am

The bug is probably on my end as I’m still trying to figure out many details.

If head_node_type is in the available_node_types but not to be scheduled, do I fill out its resources to match what I specified when running ray start --head, such as CPU, GPU, memory?

Dmitri · September 25, 2021, 4:47am

If head_node_type is in the available_node_types but not to be scheduled, do I fill out its resources to match what I specified when running ray start --head, such as CPU, GPU, memory?

Probably that’s a good idea.
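
One way to keep the two in sync is to derive the ray start flags from the head entry in the config rather than typing them twice. A rough sketch, assuming a hypothetical helper head_start_args and made-up resource numbers; it only handles CPU, GPU and memory and maps them to the --num-cpus, --num-gpus and --memory options of ray start:

```python
def head_start_args(config, config_path):
    """Build a `ray start --head` argument list from the head node type's resources.

    Hypothetical helper for illustration only; this is not a Ray API.
    """
    head = config["head_node_type"]
    resources = config["available_node_types"][head].get("resources", {})
    args = ["ray", "start", "--head", f"--autoscaling-config={config_path}"]
    # Only these three resource names are translated; anything else is ignored here.
    flag_for = {"CPU": "--num-cpus", "GPU": "--num-gpus", "memory": "--memory"}
    for name, flag in flag_for.items():
        if name in resources:
            args.append(f"{flag}={resources[name]}")
    return args


# Trimmed example config; the names and numbers are assumptions.
config = {
    "head_node_type": "head",
    "available_node_types": {
        "head": {"resources": {"CPU": 8, "GPU": 1}},
    },
}

print(" ".join(head_start_args(config, "/root/ray_bootstrap_config.yaml")))
```

The printed command can then be run on the head node, so the resources Ray is started with and the resources the autoscaler believes the head has come from the same place.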

Could you share that autoscaling config you’re using?

Also, I should say that manually starting an autoscaling cluster with an autoscaling config is possible, but isn’t as well supported or documented as using ray up.
