Using Ray over InfiniBand

How severely does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity
    I use a Slurm script to run Ray on a cluster that has InfiniBand. InfiniBand is a channel-based fabric that facilitates high-speed communication between interconnected nodes. I know that Ray is built on gRPC, which uses TCP/IP. Can Ray run over InfiniBand? Is there any plan to support using Ray over InfiniBand?

No plans for the foreseeable future! But I can imagine some paths that could be beneficial (like object transfer).

Hey @Chen_Shen aren’t we planning to support this?

This mainly just requires the ability to specify a particular network card.

Hey @xyzyx, do you have a setup on which you want to use InfiniBand?
Currently, Ray doesn’t support InfiniBand, mainly because there is not yet a standardized, easy-to-use API for it, compared to Ethernet. However, you might still be able to use it via Ethernet over InfiniBand.

That said, we’d be happy to learn more about your use case and explore this option.

I’m using the IB network interface (instead of the default) to communicate between the Ray workers in a Slurm cluster. I imagined that would make the communication faster. Isn’t that the case?
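For reference, a minimal sketch of this kind of setup, assuming Linux, an IPoIB interface named `ib0` (the name varies by cluster), and the `--node-ip-address`, `--head`, `--port`, and `--address` options of `ray start`. The helpers `ipv4_of_interface` and `ray_start_command` are hypothetical names, not part of Ray. Advertising the IPoIB address should route Ray's TCP traffic over the IB fabric, but it is still TCP/IP and does not use RDMA.

```python
import fcntl
import socket
import struct

SIOCGIFADDR = 0x8915  # Linux ioctl: get the IPv4 address of an interface


def ipv4_of_interface(ifname: str) -> str:
    """Return the IPv4 address assigned to `ifname` (Linux only)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # The ioctl takes an ifreq struct whose first field is the interface name.
        request = struct.pack("256s", ifname.encode()[:15])
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFADDR, request)
    # Bytes 20..24 of the returned ifreq hold the IPv4 address.
    return socket.inet_ntoa(reply[20:24])


def ray_start_command(node_ip, head_address=None, port=6379):
    """Build a `ray start` command that advertises `node_ip` for this node.

    With no `head_address`, the command starts a head node; otherwise it
    starts a worker that connects to the head at `head_address:port`.
    """
    cmd = ["ray", "start", f"--node-ip-address={node_ip}"]
    if head_address is None:
        cmd += ["--head", f"--port={port}"]
    else:
        cmd.append(f"--address={head_address}:{port}")
    return cmd


# Example use inside a Slurm job step (assumes the interface is called "ib0"):
#   import subprocess
#   ib_ip = ipv4_of_interface("ib0")
#   subprocess.run(ray_start_command(ib_ip), check=True)
```

On worker nodes, `head_address` would be the head node's IPoIB address, so that both ends of each connection sit on the IB network.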

@vakker00 yeah, if it’s Ethernet over the IB network, support is planned: [Feature] [core] Selecting network interface · Issue #22732 · ray-project/ray · GitHub

Hello @rliaw. InfiniBand support would be great. Instead of supporting the verbs API directly, people usually target UCX (Dask, for instance, has a backend using ucx-py) or libfabric (on AWS, for instance, it is the way to use the EFA networking stack in both MPI and NCCL applications). There are also projects like https://mercury-hpc.github.io/ that try to provide an RPC interface supporting this kind of hardware, and the tensorpipe library, which also supports this kind of interface but is focused on point-to-point communication of tensors. However, most of these libraries are somewhat lower level than gRPC.

Yes, using the IB network can make communication faster, but some features like RDMA are not fully used. I wonder whether Ray can use these features to speed up communication.

@sangcho Hello, I want to help Ray support RDMA for object transfer.
RDMA is good at transferring data between memory regions without interrupting the CPU. Combined with InfiniBand, I think it will improve Ray’s performance.

As you said, it would be beneficial. I think Ray can be treated like TensorFlow, which supports grpc+verbs and grpc+MPI. I plan to separate out the object store part and make it support RDMA, as TensorFlow does.

I also ran some tests on Ray and MPI with different networks. As the baseline, Ray and MPI both used TCP over Ethernet for data transfer. With IPoIB, MPI sped up by 15X, and Ray’s speedup varied from 2X to 15X, which leaves more room for improvement. With RDMA, MPI sped up by 50X, and I think Ray can get the same improvement.
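A comparison like this can be sketched with a plain TCP transfer timed on each network. This is not the benchmark used above and it measures neither Ray nor RDMA. It is only a rough, iperf-like illustration with hypothetical helper names: run the receiver on one node bound to the Ethernet or IPoIB address, run the sender on another node, and compare the elapsed times.

```python
import socket
import time

CHUNK = 1 << 20  # 1 MiB per send/recv call


def open_receiver(bind_ip: str, port: int = 0) -> socket.socket:
    """Listen on `bind_ip`; port 0 lets the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((bind_ip, port))
    server.listen(1)
    return server


def receive_all(server: socket.socket):
    """Accept one connection and return (bytes_received, seconds until EOF)."""
    conn, _ = server.accept()
    start = time.perf_counter()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # sender closed the connection
                break
            total += len(data)
    return total, time.perf_counter() - start


def send_payload(host: str, port: int, nbytes: int) -> None:
    """Send `nbytes` zero bytes to the receiver, then close the connection."""
    chunk = bytes(CHUNK)
    with socket.create_connection((host, port)) as conn:
        remaining = nbytes
        while remaining > 0:
            n = min(remaining, CHUNK)
            conn.sendall(chunk[:n])
            remaining -= n


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speedup of the candidate network relative to the baseline."""
    return baseline_seconds / candidate_seconds
```

Binding the receiver to the node's IPoIB address (for example the address of `ib0`, if that is the interface name) and connecting to that address from the sender gives the IPoIB timing; doing the same with the Ethernet address gives the baseline.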

I have reviewed Ray’s code, and it is a sophisticated project. I am trying to focus on the object store part, which is related to object transfer as described in Ray’s whitepaper. The whitepaper helped, but I think it is out of date. I cannot figure out how data is transferred when Ray runs. Are there any up-to-date documents I can refer to?

If I want to contribute to the Ray project, where should I start? Are there any important classes or files I should pay attention to?

For contributing you can start here

https://docs.ray.io/en/latest/ray-contribute/getting-involved.html

Thank you for your advice! :smiley:

Marking this as resolved. @xyzyx looking forward to your contribution!
