Which allreduce implementation does ray.util.collective.allreduce use? Does each worker communicate only with its two neighbors, as in a ring?
Ray delegates the allreduce implementation to the collective backend, which currently can be Gloo or NCCL. I am not that familiar with Gloo, but NCCL uses a complicated formula to decide whether to use tree-based or ring-based allreduce, so you are not guaranteed a ring where each worker talks only to its two neighbors.
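For reference, here is a minimal pure-Python sketch of what the ring pattern looks like, in case it helps to see the neighbor-only communication spelled out. This is only an illustration of the textbook ring algorithm (reduce-scatter followed by all-gather); it is not the code Ray, Gloo, or NCCL actually runs, and the function name `ring_allreduce` and the list-of-lists "workers" are made up for the example.

```python
def ring_allreduce(buffers):
    """Simulate a sum-allreduce over a ring of workers.

    `buffers` is a list with one equal-length list of numbers per worker
    (a hypothetical stand-in for each worker's tensor). In every step,
    worker r sends one chunk to worker (r + 1) % n only.
    """
    n = len(buffers)
    length = len(buffers[0])
    # Split each buffer into n contiguous chunks (the last may be larger).
    bounds = [(i * length) // n for i in range(n + 1)]
    data = [list(b) for b in buffers]  # work on copies

    def chunk(i):
        i %= n
        return slice(bounds[i], bounds[i + 1])

    # Phase 1: reduce-scatter. After n - 1 steps, worker r holds the
    # fully summed chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing messages so all sends in a step are simultaneous.
        msgs = [data[r][chunk(r - step)] for r in range(n)]
        for r in range(n):
            dst = (r + 1) % n
            sl = chunk(r - step)
            data[dst][sl] = [a + b for a, b in zip(data[dst][sl], msgs[r])]

    # Phase 2: all-gather. Each worker forwards its completed chunk
    # around the ring until every worker has every summed chunk.
    for step in range(n - 1):
        msgs = [data[r][chunk(r + 1 - step)] for r in range(n)]
        for r in range(n):
            dst = (r + 1) % n
            data[dst][chunk(r + 1 - step)] = msgs[r]

    return data


# Three simulated workers, each contributing a 4-element buffer.
result = ring_allreduce([
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [100, 200, 300, 400],
])
```

Every worker ends up with the same element-wise sum, and each one only ever sends to its right-hand neighbor and receives from its left-hand neighbor. A tree-based allreduce moves data along a different topology, which is why the answer depends on what the backend picks.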
Thank you for your answer.