Client can't find Raylet address in Ray > 1.5

I'm seeing the same error on Ray 2.1. Advice from older posts (e.g. adding a sleep delay, reducing the number of resources) didn't help.

I'm running:

# number of nodes other than the head node
worker_num=$((SLURM_JOB_NUM_NODES - 1))
for ((i = 1; i <= worker_num; i++)); do
    node_i=${nodes_array[$i]}
    echo "Starting WORKER $i at $node_i"
    this_node_ip=$(srun --nodes=1 --ntasks=1 -w "$node_i" hostname --ip-address)
    srun --nodes=1 --ntasks=1 -w "$node_i" \
        ray start --address "$ip_head" \
        --node-ip-address="$this_node_ip" \
        --num-cpus "${SLURM_CPUS_PER_TASK}" --block &
    sleep 30
done

and getting:

[2022-11-30 17:53:41,136 I 64548 64548] global_state_accessor.cc:357: This node has an IP address of xx.xx.xx.xx, while we can not find the matched Raylet address. This maybe come from when you connect the Ray cluster with a different IP address or connect a container.
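
One thing that might be worth checking (this is a guess, not a confirmed cause): on some hosts `hostname --ip-address` prints more than one address, for example an IPv6 and an IPv4 address separated by a space. In that case `$this_node_ip` would not be a single address, and the value passed to `--node-ip-address` may not match what the Raylet registered. Below is a minimal sketch of a filter that keeps only the first IPv4-looking entry. `pick_ipv4` is a hypothetical helper name, not part of Ray or Slurm, and the sample value is made up.

```shell
# Hypothetical helper (not part of Ray or Slurm): print the first
# IPv4-looking token from a whitespace-separated list of addresses.
# Returns non-zero if no such token is found.
pick_ipv4() {
    # $1 is left unquoted on purpose so the shell splits it on whitespace
    for _addr in $1; do
        case "$_addr" in
            *:*) continue ;;                      # skip IPv6-style entries
            *.*.*.*) echo "$_addr"; return 0 ;;   # dotted form, treat as IPv4
        esac
    done
    return 1
}

# Made-up sample resembling multi-address output of `hostname --ip-address`
sample="fe80::1 10.0.0.5"
this_node_ip=$(pick_ipv4 "$sample")
echo "$this_node_ip"   # prints 10.0.0.5
```

In the worker loop this would wrap the existing lookup, i.e. passing the output of the `srun ... hostname --ip-address` call through `pick_ipv4` before using it for `--node-ip-address`. Printing `$this_node_ip` for each worker and comparing it with the address in the error message should show quickly whether this applies.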