Performance of passing large object as remote function argument

yangw1234 · January 22, 2021, 7:23am

Hi!

Is there any performance difference between passing a large object to a remote function directly and passing the object ref returned by ray.put?

E.g.

big_array = np.random.randn(1024, 1024, 1024)
some_function.remote(big_array)

Or

big_array = np.random.randn(1024, 1024, 1024)
big_array_ref = ray.put(big_array)
some_function.remote(big_array_ref)

kai · January 22, 2021, 9:53am

Hi @yangw1234, when you pass a large object directly, Ray calls ray.put() on it under the hood. So if you pass the array to exactly one remote call, there shouldn’t be a performance difference.

However, if you pass the big array directly to more than one remote call, Ray will push it to the object store once per call, which creates unnecessary overhead in both time and memory usage. In that case, call ray.put() once yourself and pass the resulting object ref to each call.

See also here in the docs: Tips for first-time users — Ray v2.0.0.dev0
