Performance of passing a large object as a remote function argument

Hi!

Is there any performance difference between passing a large object to a remote function directly and passing the object ref returned by ray.put?

E.g.

big_array = np.random.randn(1024, 1024, 1024)
some_function.remote(big_array)

Or

big_array = np.random.randn(1024, 1024, 1024)
big_array_ref = ray.put(big_array)
some_function.remote(big_array_ref)

Hi @yangw1234, when you pass a large object directly, Ray calls ray.put() on it under the hood. So if you pass the array to exactly one remote call, there shouldn’t be a performance difference.

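However, if you pass the big array directly to more than one remote call, Ray will put the object into the object store once per call, which creates unnecessary overhead in both time and memory usage. Calling ray.put once and passing the resulting ref to every call avoids this, because all the calls then share the same stored copy.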

See also here in the docs: Tips for first-time users — Ray v2.0.0.dev0