Hi Team,
We are trying to run Ray in a container on AWS ECS Fargate, and we want to connect to the Ray container from another container (the REST container).
The Ray container started successfully. Its log is below:
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:21,888 INFO usage_lib.py:516 -- Usage stats collection is enabled by default without user confirmation because this terminal is detected to be non-interactive. To disable this, add --disable-usage-stats to the command that starts the cluster, or run the following command: ray disable-usage-stats before starting the cluster. See Usage Stats Collection - Ray 3.0.0.dev0 for more details.
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:21,888 INFO scripts.py:702 -- Local node IP: 10.X.X.X
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 SUCC scripts.py:739 -- --------------------
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 SUCC scripts.py:740 -- Ray runtime started.
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 SUCC scripts.py:741 -- --------------------
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 INFO scripts.py:743 -- Next steps
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 INFO scripts.py:744 -- To connect to this Ray runtime from another node, run
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 INFO scripts.py:747 -- ray start --address='10.186.166.231:6379'
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 INFO scripts.py:763 -- Alternatively, use the following Python code:
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,108 INFO scripts.py:765 -- import ray
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:769 -- ray.init(address='auto')
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:781 -- To connect to this Ray runtime from outside of the cluster, for example to
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:785 -- connect to a remote cluster from your laptop directly, use the following
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:789 -- Python code:
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:791 -- import ray
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:792 -- ray.init(address='ray://<head_node_ip_address>:10001')
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:801 -- To see the status of the cluster, use
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:802 -- ray status
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:812 -- If connection fails, check your firewall settings and network configuration.
2023-03-22T17:48:26.110+05:30 2023-03-22 12:18:26,109 INFO scripts.py:820 -- To terminate the Ray runtime, run
The REST container connects to the Ray container with the code below (it takes the Ray container's IP address and initializes the Ray client):
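import ray

# Private IP of the Ray head container (masked here)
ip_address = "10.X.X.X"
# 10001 is the Ray Client port shown in the startup log
ray_address = "ray://" + ip_address + ":10001"
print(ray_address)
try:
    ray.init(address=ray_address)
    print("ray initialized")
except Exception as e:
    print("ray initialization failed", e)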
When the above code runs in the REST container, we get the following error:
ray initialization failed ray client connection timeout
We are not sure what is happening here. Any suggestions or help would be appreciated.
The same setup works with containers on a local machine, but not on ECS Fargate.
Ray version: 2.2
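
Since the startup log suggests checking firewall settings and network configuration, one way to narrow this down is to test plain TCP reachability of port 10001 from the REST container before involving Ray at all. The snippet below is only a minimal sketch using the Python standard library; the helper name can_connect and the 5-second timeout are assumptions for illustration, not part of Ray.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of the Ray API.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling can_connect(ip_address, 10001) from the REST container with the same ip_address as above would show whether the port is reachable. A False result would point toward the network path between the two tasks (for example security group rules), while True would suggest the cause lies elsewhere.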