Ray Init causes error in the docker container due to non-mapped port

I have installed Python 3.7.9 and Ray inside a Docker container. From inside the container, I am trying to connect to a different machine and run a Python program with Ray on it.

Command used to run Docker:

sudo docker run -dit -p 3000:3000 -p 8265:8265 -p 6374:6374 -p 6379:6379 -p 6380:6380 -p 8000:8000 -p 8888:8888 -p 50050:50050 -p 50051:50051 -p 4000:4000 -p 4001:4001 -p 4002:4002 -p 4003:4003 -p 4040:4040 -p 10000-10200:10000-10200 -p 10201-10300:10201-10300 -p 37280:37280 -p 36458:36458 -p 38251:38251 -p 41091:41091 -p 44217:44217 -p 55711:55711 -p 58331:58331 -p 63084:63084 -p 63246:63246 -p 57454:57454 -p 63313:63313 -p 60504:60504 --shm-size=204.89gb --name nikunjJUL20 pyray tail -f /dev/null

I have mapped all the ports that I thought were necessary for running Ray.

I am making one machine the head node, with 0 worker nodes.

Command used to run Ray:

> ray start --head --node-ip-address --port 10275 --dashboard-host 0.0.0.0 --dashboard-port 8265 --object-manager-port 4000 --node-manager-port 4001 --min-worker-port 10002 --max-worker-port 10042 --ray-client-server-port 10250 --gcs-server-port 4003 --num-cpus 0 --redis-shard-ports 10201
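As a sanity check, the ports set explicitly in the `ray start` command above can be compared against the `-p` mappings from the `docker run` command. The following is a minimal sketch: the helper names `parse_published` and `unmapped` are made up for illustration, and it only looks at ports named on the command line, not at any port Ray chooses on its own.

```python
def parse_published(specs):
    """Turn docker -p specs like '10000-10200:10000-10200' into host port ranges."""
    ranges = []
    for spec in specs:
        host_part = spec.split(":")[0]
        lo, _, hi = host_part.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def unmapped(ports, ranges):
    """Return the ports that fall outside every published range."""
    return [p for p in ports if not any(lo <= p <= hi for lo, hi in ranges)]


# A subset of the -p mappings from the docker run command above.
published = parse_published([
    "8265:8265", "4000:4000", "4001:4001", "4003:4003",
    "10000-10200:10000-10200", "10201-10300:10201-10300",
])

# Ports passed explicitly to ray start above.
ray_ports = [10275, 8265, 4000, 4001, 10250, 4003, 10201]
ray_ports += list(range(10002, 10043))  # --min-worker-port .. --max-worker-port

print(unmapped(ray_ports, published))
```

If the printed list is empty, every explicitly configured port is published. Ports that Ray picks by itself are not covered by this check.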

Ray starts as follows:

> Local node IP:
> 2021-07-21 12:12:52,653 INFO services.py:1274 -- View the Ray dashboard at http://172.17.0.2:8265
>
> --------------------
> Ray runtime started.
> --------------------
>
> Next steps
> To connect to this Ray runtime from another node, run
> ray start --address='<ip of the machine>:10275' --redis-password='5241590000000000'
>
> Alternatively, use the following Python code:
> import ray
> ray.init(address='auto', _redis_password='5241590000000000')
>
> If connection fails, check your firewall settings and network configuration.
>
> To terminate the Ray runtime, run
> ray stop

However, the dashboard does not open because it cannot get the node information, and ray.init() does not work either. When I start Ray on the other machine and try to attach it to this cluster, that also fails.

When I run the following command, I get the error below:

> ray debug

2021-07-21 12:22:53,160 INFO scripts.py:206 -- Connecting to Ray instance at 172.23.10.111:10275.
2021-07-21 12:22:53,161 INFO worker.py:736 -- Connecting to existing Ray cluster at address: 172.23.10.111:10275
Active breakpoints:
Enter breakpoint index or press enter to refresh: 2021-07-21 12:22:54,889 WARNING worker.py:1123 -- The agent on node 648518b3d714 failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 326, in
    loop.run_until_complete(agent.run())
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 161, in run
    await site.start()
  File "/usr/local/lib/python3.7/site-packages/aiohttp/web_runner.py", line 128, in start
    reuse_port=self._reuse_port,
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 1389, in create_server
    % (sa, err.strerror.lower())) from None
OSError: [Errno 99] error while attempting to bind on address ('IP of the machine', 0): cannot assign requested address

As far as I can tell, it is picking a random port for some node, and I think I have not made that port available. I cannot map all of the host's ports to the Docker container. Please help me solve this issue.

When you run ray start, can you set --node-ip-address to the Docker container's IP?
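For context on the traceback: on Linux, errno 99 is `EADDRNOTAVAIL`, which is raised when a process tries to bind to an IP address that is not assigned to any local interface, and port `0` simply asks the OS to pick a free port. So the failure is probably about the address rather than an unmapped port. A minimal sketch that reproduces the same class of error outside Ray (the helper `can_bind` is hypothetical, and `192.0.2.1` is a documentation-only address assumed not to be configured on the machine):

```python
import socket


def can_bind(ip):
    """Return True if a TCP socket can bind to `ip` on an OS-chosen port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, 0))  # port 0: let the OS pick any free port
            return True
        except OSError as err:
            # On Linux, an address that is not local typically gives errno 99.
            print(f"bind to {ip} failed: [Errno {err.errno}] {err.strerror}")
            return False


print(can_bind("127.0.0.1"))  # loopback is a local address
print(can_bind("192.0.2.1"))  # stand-in for an IP this machine does not own
```

Running the same check inside the container with the value passed to --node-ip-address would show whether the container can bind to that address at all.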

Also, is it possible for you to access Redis from your host machine? (maybe the Redis inside a docker container is not reachable from outside)
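One way to check this from the host is a plain TCP connection attempt to the head node's port (10275 in the command above). This is only a sketch: `port_open` is a hypothetical helper, and `HEAD_IP` is a placeholder to be replaced with the real address.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    HEAD_IP = "127.0.0.1"  # placeholder: replace with the head node's address
    print(port_open(HEAD_IP, 10275))  # 10275 is the --port passed to ray start
```

A successful connection only shows that the port is reachable over TCP; it does not confirm that the Redis password or the Ray version matches.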

Hello Sangcho,
If I pass the Docker container's address as --node-ip-address, how will I be able to connect this Ray instance to Ray on another machine? I will check whether I can access Redis. I think the actual problem is a port that Ray cannot get while starting the process. I have also opened an issue on the GitHub project.
Thank you for your reply.

Can you give me the url to the issue? Thanks for creating the issue btw! :slight_smile:

Here is the link to the issue: Dashboard Agent Fails to Start in Docker Containers · Issue #17390 · ray-project/ray · GitHub

It seems someone is following up on it now. Please ping me if it doesn't make progress!

Okay thank you for checking :slight_smile: