Executing Ray Train with PyTorch

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to migrate my code to use Ray Train. However, when I tried to run the Fashion MNIST example from the Ray docs,
I got the following error:


RuntimeError: use_libuv was requested but PyTorch was build without libuv support

The error originates from:

site-packages\torch\distributed\rendezvous.py", line 189, in _create_c10d_store
return TCPStore(
^^^^^^^^^
RuntimeError: use_libuv was requested but PyTorch was build without libuv support

I'm currently running it on a Windows CPU-only machine, so I switched the GPU flag to False. My library versions, with Python 3.11.0, are:

PyTorch version: 2.5.1+cpu
Ray version: 2.40.0

I have tried several fixes, but they all involve setting the libuv flag to False, which I can't do because that code is invoked from within Ray.

Hi.

This looks like a problem with your PyTorch installation. In PyTorch 2.4, libuv became the default backend for TCPStore initialization; see "Introduction to Libuv TCPStore Backend" in the PyTorch Tutorials (2.5.0+cu124 documentation).

I'm not sure of the right way to build PyTorch on Windows with libuv support, and there seems to be an open issue for the same problem: "PyTorch defaults to using libuv but is built without support for it on Windows" (Issue #139990, pytorch/pytorch on GitHub).

As an alternative, you can fall back to the older backend by setting USE_LIBUV=0 in your environment. Make sure to pass this environment variable at Ray initialization via runtime_env.

For the Fashion MNIST example, you can do the following:

  if __name__ == "__main__":
+     ray.init(runtime_env={"env_vars": {"USE_LIBUV": "0"}})
      train_fashion_mnist(num_workers=4, use_gpu=True)

This way, all the Ray Train workers will have this environment variable set.
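If you want to confirm that the variable actually reaches the workers before starting a training run, something like the following might help. It is only a sketch: build_runtime_env, read_flag, and main are hypothetical names made up for illustration, and the commented-out call assumes the train_fashion_mnist function from the Ray docs example, with use_gpu=False for a CPU-only machine.

```python
import os

# Environment variables that every Ray worker process should inherit.
LIBUV_OFF_ENV = {"USE_LIBUV": "0"}


def build_runtime_env(extra_env=None):
    """Return a runtime_env dict that sets USE_LIBUV=0 (hypothetical helper)."""
    env_vars = dict(LIBUV_OFF_ENV)
    env_vars.update(extra_env or {})
    return {"env_vars": env_vars}


def main():
    # Imported lazily so the helper above can be used without Ray installed.
    import ray

    ray.init(runtime_env=build_runtime_env())

    @ray.remote
    def read_flag():
        # Runs in a Ray worker process, not in the driver.
        return os.environ.get("USE_LIBUV")

    # Prints whatever value the worker sees for USE_LIBUV.
    print("USE_LIBUV on worker:", ray.get(read_flag.remote()))

    # Assumed to be the function from the Ray docs example (CPU-only here):
    # train_fashion_mnist(num_workers=4, use_gpu=False)
```

Call main() from your script's if __name__ == "__main__": block. If the worker does not report "0", the runtime_env was probably not applied, for example because ray.init had already been called earlier in the same process.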


Really appreciate your help!