I was wondering if there is a way to use allreduce in core Ray. The way I picture it is something like:
    # on a remote worker:
    ray.allreduce(x, operation='sum', axis_name='foo', axis_size=16)
    x  # now holds the sum of all x's across all 16 workers
I imagine that Ray would block on the remote workers until it has collected axis_size=16 calls to allreduce along axis_name='foo'. It would then perform the allreduce operation and lift the block.
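To make the blocking behaviour I have in mind concrete, here is a rough sketch using plain Python threads as stand-ins for the remote workers. Nothing in it is a Ray API: `AllReduceGroup`, its `allreduce` method, and `run_workers` are hypothetical names I made up for illustration, and only the 'sum' operation is covered.

```python
import threading


class AllReduceGroup:
    """Hypothetical rendezvous point for one axis_name (not part of Ray)."""

    def __init__(self, axis_size):
        self.axis_size = axis_size
        self._values = []
        self._result = None
        self._lock = threading.Lock()
        # The barrier action runs exactly once, in a single thread,
        # after all axis_size callers have arrived and before any is released.
        self._barrier = threading.Barrier(axis_size, action=self._reduce)

    def _reduce(self):
        # Only the 'sum' operation is sketched here.
        self._result = sum(self._values)
        self._values = []

    def allreduce(self, x):
        with self._lock:
            self._values.append(x)
        # Block until axis_size calls have been collected, then release everyone.
        self._barrier.wait()
        return self._result


def run_workers(axis_size=16):
    """Start axis_size threads; each contributes its rank and gets the sum back."""
    group = AllReduceGroup(axis_size)
    results = [None] * axis_size

    def worker(rank):
        results[rank] = group.allreduce(rank)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(axis_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_workers(16)` has every worker contribute its rank, and each one should come back with the same total once all 16 calls have arrived. In Ray I would expect the rendezvous to live somewhere shared between the remote workers rather than in one process, but the blocking semantics would be the same.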
Is something like this possible? If not, do you think it’s a good idea to add this? I would love to have this kind of functionality.