1. Severity of the issue: (select one)
- None: I’m just curious or want clarification.
- Low: Annoying but doesn’t hinder my work.
- Medium: Significantly affects my productivity but can find a workaround.
- High: Completely blocks me.
2. Environment:
- Ray version: 2.47.1
- Python version: 3.12.7
- OS: Ubuntu 24.04
3. What happened vs. what you expected:
- Expected: When starting the cluster using the config below, I expect Ray to SSH into each host (head and workers), run the Docker container, and execute the setup commands defined (the invocation used is sketched below). I have also tried adding min_workers: 2 and max_workers: 2.
- Actual: Only the head node is initialized and set up. The worker nodes listed in worker_ips are never contacted or started. The terminal hangs after the head setup.
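For reference, the cluster is brought up with `ray up`; the auto-confirmed prompt in the logs ("[automatic, due to --yes]") matches an invocation like this (the config file name is an assumption):

```bash
# Bring the on-prem (local provider) cluster up; --yes auto-confirms prompts,
# which matches the "Confirm [y/N]: y [automatic, due to --yes]" log line below.
ray up config.yaml --yes

# Tear the cluster down again when done.
ray down config.yaml --yes
```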
config.yaml:
cluster_name: rag-cluster
provider:
  type: local
  head_ip: 172.16.20.3
  worker_ips: [172.16.20.1, 172.16.20.2]  # Mandatory but does not automatically start the worker nodes for a local cluster
docker:
  image: .../myimage
  pull_before_run: true
  container_name: ray_node
  run_options:
    - --gpus all
    - -v /ray_mount/model_weights:/app/model_weights
    - -v /ray_mount/data:/app/data
    - -v /ray_mount/db:/app/db
    - -v /ray_mount/.hydra_config:/app/.hydra_config
    - -v /ray_mount/logs:/app/logs
    - --env-file /ray_mount/.env
auth:
  ssh_user: root
  ssh_private_key: lucie_chat_id_rsa
head_setup_commands:
  - bash /app/ray-cluster/start_head.sh
worker_setup_commands:
  - bash /app/ray-cluster/start_worker.sh
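As a temporary workaround sketch (worker IPs, SSH user, and container name are taken from the config above; the join command is the one the head prints at the end of the logs below), each worker can be attached by hand, assuming the ray_node container is already running on each worker host:

```bash
# Workaround sketch: join each worker to the head manually, bypassing the
# autoscaler. Assumes the ray_node container from the config above is
# already running on each worker host.
for ip in 172.16.20.1 172.16.20.2; do
  ssh root@"$ip" "docker exec ray_node ray start --address='172.16.20.3:6379'"
done
```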
Logs Summary:
- ClusterState correctly shows: ['172.16.20.3', '172.16.20.2']
- Head node is launched, Docker image is pulled, setup commands run successfully
- Worker nodes are not triggered at all (see the diagnostic sketch below)
- Terminal ends with Ray's startup instructions for the head, with no indication that workers are being set up
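One way to check whether the autoscaler ever attempts the workers is to tail its logs with ray monitor (a diagnostic sketch; run it from the machine that ran ray up, with the same config file):

```bash
# Tail the autoscaler/monitor logs for this cluster; if the workers are never
# attempted, nothing about 172.16.20.1/172.16.20.2 should show up here.
ray monitor config.yaml
```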
Full logs:
Cluster: rag-cluster
Checking Local environment settings
2025-07-22 17:40:29,616 INFO node_provider.py:53 -- ClusterState: Loaded cluster state: ['172.16.20.3', '172.16.20.2']
No head node found. Launching a new cluster. Confirm [y/N]: y [automatic, due to --yes]
Usage stats collection is enabled. To disable this, add --disable-usage-stats to the command that starts the cluster, or run the following command: ray disable-usage-stats before starting the cluster.
Acquiring an up-to-date head node
2025-07-22 17:40:29,616 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['172.16.20.3', '172.16.20.2']
Launched a new head node
Fetching the new head node
<1/1> Setting up head node
Prepared bootstrap config
2025-07-22 17:40:29,618 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['172.16.20.3', '172.16.20.2']
New status: waiting-for-ssh
[1/7] Waiting for SSH to become available
Running uptime as a test.
Fetched IP: 172.16.20.3
Warning: Permanently added '172.16.20.3' (ED25519) to the list of known hosts.
17:40:30 up  9:35,  2 users,  load average: 0.20, 0.16, 0.11
Shared connection to 172.16.20.3 closed.
Success.
Updating cluster configuration. [hash=f8946df353933904aff5650b7fb3522bb7c132d7]
2025-07-22 17:40:30,141 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['172.16.20.3', '172.16.20.2']
New status: syncing-files
[2/7] Processing file mounts
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
[3/7] No worker file mounts to sync
2025-07-22 17:40:30,436 INFO node_provider.py:114 -- ClusterState: Writing cluster state: ['172.16.20.3', '172.16.20.2']
New status: setting-up
[4/7] No initialization commands to run.
[5/7] Initializing command runner
Shared connection to 172.16.20.3 closed.
Using default tag: latest
latest: Pulling from linagora/openrag-ray
Digest: sha256:cfbc0c67a16a6cd706afd011f7107b545b274da63050054cbcf403300658805c
Status: Image is up to date for …/myimage
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
Tue Jul 22 17:40:31 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.169                Driver Version: 570.169        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L4                      Off |   00000000:01:00.0 Off |                    0 |
| N/A   40C    P8             16W /   72W |       0MiB /  23034MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Shared connection to 172.16.20.3 closed.
7223a7d6027a0b182cf600da01cf7292abceee8d36daf9e92a39460f68a4c698
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
sending incremental file list
ray_bootstrap_config.yaml
sent 780 bytes  received 35 bytes  1,630.00 bytes/sec
total size is 1,348  speedup is 1.65
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
sending incremental file list
ray_bootstrap_key.pem
sent 2,121 bytes  received 35 bytes  4,312.00 bytes/sec
total size is 2,622  speedup is 1.22
Shared connection to 172.16.20.3 closed.
Shared connection to 172.16.20.3 closed.
[6/7] Running setup commands
(0/1) bash /app/ray-cluster/start_he…
Enable usage stats collection? This prompt will auto-proceed in 10 seconds to avoid blocking cluster startup. Confirm [Y/n]: Y
Usage stats collection is enabled. To disable this, add --disable-usage-stats to the command that starts the cluster, or run the following command: ray disable-usage-stats before starting the cluster.
Local node IP: 172.16.20.3
Ray runtime started.
Next steps
To add another node to this Ray cluster, run
ray start --address='172.16.20.3:6379'
To connect to this Ray cluster:
import ray
ray.init(_node_ip_address='172.16.20.3')
To submit a Ray job using the Ray Jobs CLI:
RAY_ADDRESS='http://172.16.20.3:8265' ray job submit --working-dir . -- python my_script.py
for more information on submitting Ray jobs to the Ray cluster.
To terminate the Ray runtime, run
ray stop
To view the status of the cluster, use
ray status
To monitor and debug Ray, view the dashboard at
172.16.20.3:8265
If connection to the dashboard fails, check your firewall settings and network configuration.
- Terminal remains idle here after head setup; no activity or errors
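To confirm that only the head has registered, cluster membership can be checked from the head node (a sketch, assuming the ray_node container from the config is running there); ray status is the same command the startup banner above suggests:

```bash
# Ask the running head for cluster membership; with no workers attached,
# `ray status` should report a single active node (the head).
ssh root@172.16.20.3 "docker exec ray_node ray status"
```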
Sorry to hear about that, feel free to reach out to me on Slack if you’re still having issues.