Hello, I have a problem with starting an autoscaling cluster on AWS.
I start the cluster with the YAML file below, but no worker nodes are launched. What am I doing wrong, please?
Ray version 1.9.1 on WSL.
Python 3.7
# A unique identifier for the head node and workers of this cluster.
cluster_name: basic-ray3
# The maximum number of worker nodes to launch in addition to the head
# node. This takes precedence over min_workers. min_workers defaults to 0.
max_workers: 12
upscaling_speed: 5.0
#idle_timeout_minutes: 5
available_node_types:
    ray.head.default:
        resources: {"CPU": 4}
        node_config:
            InstanceType: m5.xlarge
            KeyName: hoky-ray
            #ImageId: latest_dlami
    ray.worker.default:
        resources: {"CPU": 4}
        min_workers: 12
        max_workers: 12
        node_config:
            InstanceType: m5.xlarge
            KeyName: hoky-ray
            #ImageId: latest_dlami
# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a
# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: ubuntu
    ssh_private_key: ~/.ssh/hoky-ray.pem
setup_commands:
    - pip install ray[all] # We won’t use pytorch.
    # However, this and the following line demonstrate that you can specify arbitrary
    # startup scripts on the cluster.
    - pip install empyrical pandas==1.2.3 tqdm
file_mounts: {
    "~": ".", # /mnt/c/Users/Hoky/ray
}
head_node_type: ray.head.default
worker_default_node_type: ray.worker.default
# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
    - ray stop
    - ulimit -n 65536; ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml
# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
    - ray stop
    - ulimit -n 65536; ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076
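The config comment says the cluster-wide max_workers takes precedence over min_workers, so one thing worth ruling out is a conflict between those limits. The following is a minimal sketch of such a check, assuming the relevant keys are copied by hand into a Python dict. The helper check_worker_limits is hypothetical and is not part of Ray or its own config validation. It only compares the limits against each other and cannot explain by itself why workers fail to launch.

```python
# Hypothetical helper, not part of Ray. The dict mirrors the relevant
# keys of the cluster YAML above, copied by hand.
config = {
    "max_workers": 12,
    "head_node_type": "ray.head.default",
    "available_node_types": {
        "ray.head.default": {"resources": {"CPU": 4}},
        "ray.worker.default": {
            "resources": {"CPU": 4},
            "min_workers": 12,
            "max_workers": 12,
        },
    },
}


def check_worker_limits(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    node_types = cfg["available_node_types"]
    head = cfg["head_node_type"]
    global_max = cfg["max_workers"]

    if head not in node_types:
        problems.append(f"head_node_type {head!r} is not in available_node_types")

    total_min = 0
    for name, spec in node_types.items():
        if name == head:
            continue  # the head node is not counted as a worker
        type_min = spec.get("min_workers", 0)
        # Assumption of this sketch: a missing per-type limit falls back
        # to the cluster-wide max_workers.
        type_max = spec.get("max_workers", global_max)
        if type_min > type_max:
            problems.append(f"{name}: min_workers {type_min} > max_workers {type_max}")
        total_min += type_min

    if total_min > global_max:
        problems.append(
            f"sum of min_workers ({total_min}) exceeds cluster max_workers ({global_max})"
        )
    return problems


problems = check_worker_limits(config)
print(problems if problems else "no conflicting worker limits found")
```

For the values in this config the limits agree (12 minimum, 12 maximum), so the sketch reports no conflict.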
hoky@DESKTOP-P1L6T2J:/mnt/c/Users/Hoky/ray$ ray up cluster.yaml -y && ray submit cluster.yaml compute_remote.py
Cluster: basic-ray3
2021-12-28 22:28:40,196 INFO util.py:282 -- setting max workers for head node type to 0
Checking AWS environment settings
AWS config
IAM Profile: ray-autoscaler-v1 [default]
EC2 Key pair (all available node types): hoky-ray
VPC Subnets (all available node types): subnet-071195b1ea481cc82 [default]
EC2 Security groups (all available node types): sg-030fd1153af5ebf9d [default]
EC2 AMI (all available node types): ami-0a2363a9cff180a64 [dlami]
No head node found. Launching a new cluster. Confirm [y/N]: y [automatic, due to --yes]
Acquiring an up-to-date head node
Launched 1 nodes [subnet_id=subnet-071195b1ea481cc82]
Launched instance i-00186169f828c3616 [state=pending, info=pending]
Launched a new head node
Fetching the new head node
<1/1> Setting up head node
Prepared bootstrap config
New status: waiting-for-ssh
[1/7] Waiting for SSH to become available
Running `uptime` as a test.
Waiting for IP
Not yet available, retrying in 5 seconds
Received: 52.40.45.139
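The log above only covers the head node setup, so it does not show whether any workers joined later. A minimal sketch for checking this from the head node follows, assuming Ray is installed there and the cluster is running. summarize_nodes and summarize_live_cluster are hypothetical helper names, and the sample data is illustrative rather than real output. ray.init(address="auto") and ray.nodes() are the Ray calls used.

```python
def summarize_nodes(nodes):
    """Count alive nodes and sum their CPUs from a ray.nodes()-style list of dicts."""
    alive = [n for n in nodes if n.get("Alive")]
    cpus = sum(n.get("Resources", {}).get("CPU", 0) for n in alive)
    return {"alive_nodes": len(alive), "total_cpus": cpus}


def summarize_live_cluster():
    """Connect to the running cluster and summarize it (needs Ray installed)."""
    import ray  # imported here so summarize_nodes works without Ray

    ray.init(address="auto")  # attach to the already running cluster
    return summarize_nodes(ray.nodes())


# Illustrative data shaped like ray.nodes() output; not taken from a real run.
sample = [
    {"Alive": True, "Resources": {"CPU": 4.0}},
    {"Alive": False, "Resources": {"CPU": 4.0}},
]
print(summarize_nodes(sample))
```

On the head node, calling summarize_live_cluster() would give the live numbers. With the config above, a fully scaled cluster would be expected to report 13 alive nodes (the head plus 12 workers); a result of 1 would mean that only the head node is up.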
Thanks for the tips!