Hello, thanks for building a great library!
I am not sure if this is possible, but essentially I would like to run concurrent ray.get() calls inside a DataLoader, so that multiple workers pull data from the object store onto the driver process at the same time while other work continues in a pipelined manner. The code looks roughly like this:
import multiprocessing as mp
import ray

# Somewhere in driver code
refs = [ray.put(x) for x in (my_big_data_A, my_big_data_B, my_big_data_C)]

def ray_get_worker(ref):
    return ray.get(ref)

# Later on in the DataLoader
def __next__(self):
    # I want to fetch all refs concurrently while other work is ongoing
    ray.get(refs)  # too slow AND blocks

    p = mp.Pool(4)
    parallel_get = p.imap(ray_get_worker, refs)  # doesn't block and, in theory, gives 4 workers (doesn't work though ;)

    # work that takes ~500ms (I would like ray.get() to be "running" during this time)

    # grab the ray.get() results
    for result in parallel_get:
        # do something with the local data
        ...
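
To make the intent more concrete, here is a rough sketch of the pattern I have in mind, using a ThreadPoolExecutor on the driver instead of multiprocessing (the class name PrefetchingLoader and the placeholder payloads are made up, and I'm assuming ray.get() can safely be called from background threads):

import concurrent.futures
import ray

ray.init()

# Placeholder payloads -- in my real code these are much larger
big_blobs = [b"A" * 10**6, b"B" * 10**6, b"C" * 10**6]
refs = [ray.put(blob) for blob in big_blobs]

class PrefetchingLoader:
    def __init__(self, refs, workers=4):
        self.refs = refs
        # Threads live in the driver process, so ray.get() in a thread
        # reads from the local object store rather than re-serializing
        self.pool = concurrent.futures.ThreadPoolExecutor(max_workers=workers)

    def __next__(self):
        # Kick off all fetches without blocking the main thread
        futures = [self.pool.submit(ray.get, ref) for ref in self.refs]
        # ... the ~500ms of other work would happen here ...
        # Collect the results once they are actually needed
        return [f.result() for f in futures]

loader = PrefetchingLoader(refs)
batch = next(loader)  # list of the three payloads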
I read that Ray is faster than multiprocessing (which would have to serialize the large data structures when returning them to the driver), so I understand that using multiprocessing here won't work, but I am looking for an equivalent way to ray.get() multiple refs in a non-blocking, synchronous manner, if such a thing exists!
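
Is something along the lines of ray.wait() with timeout=0 the intended approach here? A rough sketch of what I mean (the payloads are again placeholders):

import ray

ray.init()

refs = [ray.put(b"X" * 10**6) for _ in range(3)]

# timeout=0 makes this a non-blocking readiness check
ready, not_ready = ray.wait(refs, num_returns=len(refs), timeout=0)

# ... the ~500ms of other work would happen here ...

results = ray.get(ready)       # already local, should be fast
results += ray.get(not_ready)  # blocks until the rest arrive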