How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello. I have the following cluster configuration:
cluster_name: distmat
provider:
    type: local
    head_ip: "192.168.128.129"
    worker_ips: ["192.168.128.211", "192.168.128.212", "192.168.128.213", "192.168.128.214", "192.168.128.215", "192.168.128.221", "192.168.128.222", "192.168.128.223", "192.168.128.224", "192.168.128.225"]
auth:
    ssh_user: <<user>>
upscaling_speed: 1.0
idle_timeout_minutes: 5
file_mounts_sync_continuously: False
rsync_exclude:
    - "**/.git"
    - "**/.git/**"
rsync_filter:
    - ".gitignore"
head_start_ray_commands:
    - ray stop
    - ulimit -c unlimited && ray start --head --port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml
worker_start_ray_commands:
    - ray stop
    - ray start --address=$RAY_HEAD_IP:6379
I’ve been facing an issue where running ray up cluster.yaml (with or without the --no-config-cache flag) or ray down cluster.yaml hangs indefinitely until I manually terminate it with Ctrl+C. Strangely, the problem appeared unexpectedly: the same setup used to work without issues.
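Since the cluster launcher reaches the nodes over SSH, one first check for a hang like this is whether every node in the config still accepts TCP connections on the SSH port. Below is a minimal sketch of such a check, not a Ray tool. It assumes SSH listens on port 22 (the config does not say), and the helper names `check_ssh_reachable` and `check_nodes` are made up for illustration.

```python
import socket

# Node addresses copied from the cluster config above.
HEAD_IP = "192.168.128.129"
WORKER_IPS = [
    "192.168.128.211", "192.168.128.212", "192.168.128.213",
    "192.168.128.214", "192.168.128.215", "192.168.128.221",
    "192.168.128.222", "192.168.128.223", "192.168.128.224",
    "192.168.128.225",
]

SSH_PORT = 22    # assumption: default SSH port, not stated in the config
TIMEOUT_S = 3.0  # assumption: arbitrary per-node timeout in seconds


def check_ssh_reachable(host, port=SSH_PORT, timeout=TIMEOUT_S):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nodes(hosts, port=SSH_PORT, timeout=TIMEOUT_S):
    """Map each host to True/False depending on whether the port is reachable."""
    return {host: check_ssh_reachable(host, port, timeout) for host in hosts}
```

Calling `check_nodes([HEAD_IP, *WORKER_IPS])` from the machine where ray up is run returns one entry per node. A node that maps to False would be a candidate for the hang, although an open port only shows that the SSH daemon is listening, not that login as the configured ssh_user succeeds.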
If you have any insights or solutions to this problem, I’d be glad if you’d share them.
Thanks