Lack of IPv6 support

Greetings to everyone!

I am currently struggling to make Ray work in a private cluster which is interconnected by an IPv6-only network.

It seems there is no way to make all the processes on the head node bind to IPv6 addresses and advertise those addresses across the cluster so that the rest of the nodes work properly.

Not being very good at network engineering, I might have missed some obvious way to make Ray use local IPv4 addresses on top of IPv6 (tunneling, for example). But from a DevOps standpoint, such solutions seem to carry the extra cost of non-trivial network configuration, and it would be much better to just make the code IPv6-ready.

A brief exploration of the code base shows that the core networking is built on boost::asio and Python sockets, so the underlying transport definitely supports IPv6. But most of the control plane logic is written in a way that either resolves hostnames using socket.gethostbyname (which is IPv4-only) instead of socket.getaddrinfo, or expects addresses to contain a host and port separated by a single ':' character, and so uses code like address.split(':') extensively.
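To make the two patterns concrete, here is a minimal sketch of what an IPv6-tolerant version might look like. This is not Ray's actual code: the helper names split_host_port and resolve are hypothetical, and I am assuming that IPv6 literals would be written in the bracketed form "[addr]:port".

```python
import socket


def split_host_port(address):
    """Split 'host:port' or '[ipv6]:port' into (host, port).

    Hypothetical helper, not part of Ray. Assumes IPv6 literals are
    wrapped in square brackets, as in URLs.
    """
    if address.startswith("["):
        # Bracketed IPv6 literal: everything up to ']' is the host.
        host, sep, rest = address[1:].partition("]")
        if not sep or not rest.startswith(":"):
            raise ValueError(f"malformed address: {address!r}")
        return host, int(rest[1:])
    # Hostname or IPv4 literal: split on the last ':' only.
    host, sep, port = address.rpartition(":")
    if not sep or ":" in host:
        # Either no port at all, or an unbracketed IPv6 literal,
        # which is ambiguous and therefore rejected.
        raise ValueError(f"malformed address: {address!r}")
    return host, int(port)


def resolve(host, port):
    """Resolve a host for TCP with getaddrinfo instead of gethostbyname.

    Returns a list of (family, sockaddr) pairs. Unlike gethostbyname,
    getaddrinfo can return AF_INET6 entries as well as AF_INET ones.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _, _, _, sockaddr in infos]


# The naive approach breaks on an IPv6 literal because the address
# itself contains colons:
naive_parts = "[::1]:6379".split(":")  # four pieces, not two
```

With a helper like this, split_host_port("[::1]:6379") yields the host "::1" and the port 6379, while IPv4 addresses and plain hostnames keep working as before.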

I am wondering, has anybody had the same problem and found a workaround?
Or maybe it is a known issue, and someone is working on it already?

I think you are the first to report this issue. I find it kind of exciting that you’ve already found the solution too. Do you think that you could contribute the code that would make the Ray control plane compatible with IPv6?

I’m not sure fixing the issue will be as easy as locating it, but I would be glad to contribute the solution back as soon as I have one (if at all).

Based on this discussion, I filed: [Core] Ray IPv6 support by samrocketman · Pull Request #44252 · ray-project/ray · GitHub