Cannot create a Ray Worker on WSL

andrferreira · August 16, 2024, 4:23pm

How severely does this issue affect your experience of using Ray?

High: It blocks me from completing my task.

Dear all,

I am new to Ray, but I have been playing with it for the last 2 weeks. I have some tasks that I want to parallelize across my company's servers, and Ray looks like a great tool for that. The only issue is that our best servers run Windows, and from my research I gathered that Ray is not yet ready to run distributed tasks across multiple Windows machines.

So, to take advantage of the hardware on those servers, I was thinking about using WSL (Windows Subsystem for Linux) to work around the Windows limitations.

Because of firewall issues on Windows, I decided to set up the Ray head on a pure Linux server we have. This server is only meant to manage the Ray cluster; the more demanding computational tasks will be done by Ray workers running in WSL.

I am using Ray version 2.34.0 on all machines.

I successfully started the Ray head, and everything looks fine on the dashboard. But when I try to set up a node in WSL, the worker initially appears on my dashboard and after a few seconds it is killed due to missing heartbeats.

I have tried a Linux-Linux configuration and everything works as expected. The worker is added to the cluster and the connection is stable.

For the WSL configuration, I have already tried a couple of things. I disabled the firewall, without success. With netsh, I added a forwarding rule to forward traffic on a range of ports from my Windows machine to WSL (I tested starting the Ray head with the --worker-port-list option to define a list of ports). The IP address used to start the Ray worker is the one of my Windows machine, since this is the one visible on the network. But nothing is working…
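
One way to narrow this down is to test plain TCP reachability of the forwarded ports, independent of Ray. The Python sketch below is only an assumption about how such a check could look: `check_tcp_ports` is a hypothetical helper name, not part of Ray, and it only shows whether a TCP connection can be opened, not whether Ray's own health checks succeed.

```python
import socket
from typing import Dict, Iterable


def check_tcp_ports(host: str, ports: Iterable[int], timeout: float = 2.0) -> Dict[int, bool]:
    """Try to open a TCP connection to each port on `host`.

    Hypothetical diagnostic helper (not a Ray API). Returns a dict mapping
    each port to True if the connection succeeded, False otherwise.
    """
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results
```

For example, calling `check_tcp_ports("<windows-host-ip>", [10002, 10003])` from the head node would show whether the ports forwarded with netsh can be reached from there. The IP placeholder and the port numbers are illustrative only and should be replaced with the values actually passed to `ray start`. The same check can be run from inside WSL against the head node to test the other direction.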

At this stage, I have no idea what to try next, and it would be great to get some help from you.

Is there anything I am missing? I am a newbie at this networking stuff, so there is a high chance I am making some silly mistake…

Thanks for the help.
Andre
