Hello! I did a bit of searching on the docs and I think this is what’s going on here.
The ray monitor and ray status commands expect a direct address to the Ray cluster, typically in the format ip:port (e.g., 10.0.0.1:6379), rather than a URL with a protocol like http://. (ray job submit and ray summary are designed to talk to http:// endpoints, which is why they don’t throw an error.)
The commands that fail expect a bare address such as 10.212.131.238:6379 (the GCS port, 6379 by default) rather than the dashboard URL http://10.212.131.238:8265/. Switching to that format should resolve the issue for commands like ray monitor and ray status.
So, I think you can try using http:// for any dashboard commands, and the <ip>:<port> format for any cluster interactions. Let me know if this fixes the issue for you.
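If it helps to keep the two address styles straight, here is a minimal sketch of some shell helpers. The function names are hypothetical (they are not part of Ray), and the default ports assume an unmodified setup where the GCS listens on 6379 and the dashboard on 8265.

```bash
# Hypothetical helpers (not part of Ray) for building the two address styles.

# Bare ip:port form for cluster commands such as `ray status`.
# Second argument is the GCS port; 6379 is Ray's default.
gcs_address() {
  printf '%s:%s\n' "$1" "${2:-6379}"
}

# http:// form for dashboard-backed commands such as `ray job submit`.
# Second argument is the dashboard port; 8265 is Ray's default.
dashboard_address() {
  printf 'http://%s:%s\n' "$1" "${2:-8265}"
}

# Strip an http:// or https:// scheme and anything after the first slash,
# leaving host:port (the port itself is left unchanged).
to_host_port() {
  local addr="$1"
  addr="${addr#http://}"
  addr="${addr#https://}"
  printf '%s\n' "${addr%%/*}"
}

# Usage (needs a reachable cluster, so left commented out;
# script.py is a placeholder name):
# ray status --address "$(gcs_address 10.212.131.238)"
# ray job submit --address "$(dashboard_address 10.212.131.238)" -- python script.py
```

Note that to_host_port only changes the format; the dashboard port and the GCS port are still different ports, so the port usually has to be swapped as well.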
Thanks for pointing me in the right direction Christina!
I had to allow connections on port 6379 for ray status --address 10.212.154.239:6379 to work.
Curiously, I also had to allow connections on port 41823 for ray memory --address 10.212.154.239:6379 to work. However, it seems this port is chosen randomly at start, so after restarting my cluster the command failed again. Do you know perhaps which of the ray start options mentioned here would control this port so I can make it constant?
I should say it’s rather unexpected and impractical that ray status and ray job submit expect the address in different formats, even though both fall back to the same environment variable when no address is given on the command line. This makes it impossible to avoid the --address option for a subset of commands.
In case someone finds it helpful: as an alternative to specifying --address, I noticed that all of these commands work without the --address option or the RAY_ADDRESS env var when run on the head node, which can be done from outside the cluster via ray exec, for example ray exec config.yaml 'ray status'. Some may find this more convenient than having to remember what to pass as --address.
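The ray exec approach can be wrapped in a small shell function so the config file does not have to be retyped. This is only a sketch: on_head and RAY_CLUSTER_CONFIG are names made up for illustration, and config.yaml is the cluster config file from the example above.

```bash
# Hypothetical wrapper (not part of Ray): run a Ray CLI command on the head
# node through `ray exec`, so no --address or RAY_ADDRESS is needed locally.
# RAY_CLUSTER_CONFIG is an assumed variable name; it falls back to config.yaml.
on_head() {
  # "$*" joins all arguments into the single command string ray exec expects.
  ray exec "${RAY_CLUSTER_CONFIG:-config.yaml}" "$*"
}

# Usage (needs a running cluster, so left commented out):
# on_head ray status
# on_head ray memory
```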
Thank you for your help finding workarounds to my troubles!
Hi! I’m glad to hear you solved the port problems!
As for the ray start options, I think these would help make the ports predictable:
--min-worker-port: Minimum port number worker can be bound to. Default: 10002.
--max-worker-port: Maximum port number worker can be bound to. Default: 19999.
For example, these let you set a specific range of ports that Ray workers can use, making it easier to manage and predict which ports need to be open. Here is how you can specify these options:
```bash
ray start --head --port=6379 --min-worker-port=10000 --max-worker-port=10010
```
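This will limit the worker ports to the range 10000-10010, and you can open these ports in your firewall settings. If you want to pin things to a single port such as 41823, I guess you could set the min and max to the same number, though that would leave only one port for all worker processes, so a small range is probably the safer choice.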
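To see how many ports a given range actually provides before opening it in the firewall, a small helper like the one below may be useful. It is only a sketch: worker_port_count is a hypothetical function, not a Ray command, and it assumes each worker process needs its own port from the range. The commented ray start line additionally assumes that the random port needed by ray memory belongs to the raylet rather than to a worker; if that is the case, --node-manager-port and --object-manager-port would be the options to pin. The values 6380 and 6381 are arbitrary examples.

```bash
# Hypothetical helper (not part of Ray): number of ports in an inclusive range.
worker_port_count() {
  local min_port="$1" max_port="$2"
  if [ "$max_port" -lt "$min_port" ]; then
    echo "max port must be >= min port" >&2
    return 1
  fi
  echo $(( max_port - min_port + 1 ))
}

# worker_port_count 10000 10010   # prints 11 (range from the command above)
# worker_port_count 41823 41823   # prints 1  (min == max leaves a single port)

# Assumption: the randomly chosen port is a raylet port. If so, pinning it
# could look like this (needs Ray installed, so left commented out):
# ray start --head --port=6379 \
#   --node-manager-port=6380 --object-manager-port=6381 \
#   --min-worker-port=10000 --max-worker-port=10010
```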