I set up a GCP instance to serve as the head node (cluster coordinator) for executing Ray tasks. The idea is to use local PCs scattered throughout the world as worker nodes. After starting the head node (a regular ray start --head), I try to add my local PC with ray.init(address=<external_ip_of_gcp_instance>:6379, _redis_password=<password>), but the process hangs indefinitely at
-- Connecting to existing Ray cluster at address: <external_ip_of_gcp_instance>:6379
I traced the hang to global_state.get_node_to_connect_for_driver(node_ip_address) (line 286 of services.py). I have opened the port on the instance (inbound, TCP 6379). The same happens on both Windows and Linux.
I also tried tunneling the [<internal_ip_of_gcp_instance>:]6379 port through SSH and then passing localhost:6379 as the address, but then I got redis.exceptions.ConnectionError: Error 10061.
If I do the same for ports 8265 and 10001, I can access the dashboard and the client service both from the external IP and from the SSH tunnel through localhost. So the issue seems to be related to the Redis server setup.
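For reference, the port checks described above can be reproduced with a small script like the one below. This is only a minimal sketch using the standard library: the helper names port_is_open and check_ports and the HEAD_HOST value are hypothetical, and a successful TCP connection only shows that the port is reachable, not that the service behind it (Redis, the dashboard, or the client server) is answering correctly.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake and raises OSError
        # (refused, timed out, unreachable) when it cannot complete it.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Hypothetical value: replace with the external IP of the head node,
    # or keep "localhost" when testing through an SSH tunnel.
    HEAD_HOST = "localhost"
    # 6379 = Redis, 8265 = dashboard, 10001 = client service (as in the post).
    for port, reachable in check_ports(HEAD_HOST, [6379, 8265, 10001]).items():
        print(f"{HEAD_HOST}:{port} reachable={reachable}")
```

If 6379 is reported as reachable here while ray.init still hangs, the problem is presumably not the firewall rule itself but something that happens after the initial connection.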
Any help to make this work would be much appreciated.