Greetings to everyone!
I am currently struggling to make Ray work in a private cluster which is connected by an IPv6-only network.
It seems that there is no way to make all the processes on the head node bind to IPv6 addresses and advertise those addresses across the cluster so that the rest of the nodes work properly.
I am not very good at network engineering, so I might have missed some obvious way to make Ray use local IPv4 addresses on top of IPv6 (tunneling, for example). But from a DevOps standpoint, such solutions seem to carry the extra cost of non-trivial network configuration, and it would be much better to just make the code IPv6-ready.
A brief exploration of the code base shows that the core networking is built on boost::asio and Python sockets, meaning that it definitely supports IPv6. But most of the control plane logic is written in a way that either resolves hostnames using socket.gethostbyname (which is IPv4-only) instead of socket.getaddrinfo, or expects addresses to contain a host and a port separated by a single ':' character, and therefore uses code like address.split(':') extensively.
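To make the two problems concrete, here is a minimal sketch of what IPv6-aware replacements could look like. The helper names (split_host_port, join_host_port, resolve) are my own and hypothetical, not anything from Ray's code base; the sketch assumes IPv6 literals are written in the usual bracketed form, e.g. [::1]:6379.

```python
import socket


def split_host_port(address):
    """Split 'host:port' or '[ipv6]:port' into (host, port).

    Hypothetical helper, not part of Ray. A plain address.split(':')
    breaks on IPv6 literals because they contain ':' themselves.
    """
    if address.startswith("["):
        # Bracketed IPv6 literal: everything up to ']' is the host.
        host, sep, rest = address[1:].partition("]")
        if not sep or not rest.startswith(":"):
            raise ValueError(f"malformed address: {address!r}")
        return host, int(rest[1:])
    # Hostname or IPv4: the last ':' separates host and port.
    host, sep, port = address.rpartition(":")
    if not sep or ":" in host:
        # No port at all, or an unbracketed IPv6 literal (ambiguous).
        raise ValueError(f"malformed address: {address!r}")
    return host, int(port)


def join_host_port(host, port):
    """Inverse of split_host_port: bracket the host if it contains ':'."""
    return f"[{host}]:{port}" if ":" in host else f"{host}:{port}"


def resolve(host, port):
    """Resolve with getaddrinfo, which can return both IPv4 and IPv6 results.

    socket.gethostbyname only ever returns an IPv4 address.
    Returns a list of (address_family, sockaddr) pairs.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]
```

With something like this, split_host_port("[fd00::1]:6379") gives ("fd00::1", 6379), whereas "[fd00::1]:6379".split(":") produces four pieces and any host, port = ... unpacking fails.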
I am wondering, has anybody had the same problem and found a workaround?
Or maybe it is a known issue, and someone is working on it already?