Running ray cluster on vastai cloud

Chetan_Dhembre · August 8, 2025, 3:18pm

1. Severity of the issue: (select one)
High: Completely blocks me.

2. Environment:

Ray version: ray, version 2.48.0
Python version: Python 3.10.12
OS: Ubuntu 22.04
Cloud/Infrastructure: VastAI cloud
Other libs/tools (if relevant): used together with vLLM

3. What happened vs. what you expected:

Expected: I am trying to create a head node and a worker node on two different instances hosted on VastAI cloud, using the commands from the documentation.
Actual: I cannot get it working. The worker joins the cluster, but after some time the GCS server marks the worker as dead because of a health check failure. This happens because of the unusual networking, which I explain below.

I have created two instances on VastAI using a template which opens 4 ports, say A, B, C, D. The instances are Docker containers/VMs running on a host, so my VM ports (i.e. A, B, C, D) are forwarded to random external ports that are different from A, B, C, D. Each instance also gets a different external port assigned for the same internal port.
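To make the mapping concrete, here is a minimal sketch of how the external port for a given internal port could be looked up on each instance. It assumes the mapping is available as environment variables named `VAST_TCP_PORT_<internal port>`. That naming is an assumption on my side and should be checked with `env | grep PORT` on the instance. The function name `external_port_for` and the port numbers are only illustrative.

```shell
#!/usr/bin/env bash
# external_port_for INTERNAL_PORT [PREFIX]
# Prints the external port mapped to INTERNAL_PORT, read from an environment
# variable named <PREFIX><INTERNAL_PORT>.
# ASSUMPTION: the default prefix VAST_TCP_PORT_ is a guess at the platform's
# naming; pass a different prefix if the instance uses another convention.
external_port_for() {
    local internal="$1" prefix="${2:-VAST_TCP_PORT_}"
    local var="${prefix}${internal}"
    # ${!var} is bash indirect expansion: the value of the variable named by $var.
    local external="${!var:-}"
    if [ -z "$external" ]; then
        echo "no mapping found for internal port ${internal} (${var} is unset)" >&2
        return 1
    fi
    echo "$external"
}
```

For example, `external_port_for 6379` on the head instance would print whichever random external port the host assigned to internal port 6379, and the same call on the worker instance would normally print a different number.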

If the node manager port for my worker is an internal port, the head node is not able to connect to it during the health check, because that port is only valid inside the worker instance.

I am giving each instance's public IP in the configuration, so the two instances can discover each other but cannot connect on the port.
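The connectivity problem can be checked from either instance with a small probe like the one below. This is only a sketch: `port_open` is a helper name made up for this post, and it relies on bash's `/dev/tcp` feature and the coreutils `timeout` command being available on the instance.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT succeeds within the timeout
# (default 3 seconds), and non-zero otherwise.
port_open() {
    local host="$1" port="$2" secs="${3:-3}"
    # /dev/tcp/<host>/<port> is a bash feature, so the probe runs in bash
    # explicitly; host and port are passed in as $0 and $1 of the inner shell.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

Running `port_open "$WORKER_IP" "$WORKER_INTERNAL_PORT_2" || echo "not reachable"` from the head instance shows whether the port the worker advertises can be reached from outside. Repeating it with the worker's external port for the same service (whatever the host assigned) shows whether only the forwarded port is open.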

So in summary, I am not able to run a Ray cluster on VMs which are not in the same network and whose port mapping is not consistent between instances.

I was wondering if I am missing something. I have not seen any guide anywhere on how to get a Ray cluster working on VastAI cloud. I have also done a lot of research using GPT-5 and other LLMs to find an answer.

I am using the following two commands to run the head and the worker:

ray start --head \
   --port=$HOST_INTERNAL_PORT1

ray start --address="$HEAD_IP:$HOST_EXTERNAL_PORT1" \
   --node-manager-port=$WORKER_INTERNAL_PORT_2 \
   --node-ip-address=$WORKER_IP

Please help me, this is a blocker in my project.
