Zero-copy deserialization with recursive dictionaries/lists

Hi there,

I’m curious about the limits of Ray’s zero-copy deserialization. My current understanding is that if a function returns a large NumPy array (over 100KB by default), it is put in the Plasma object store, so that any other executing task can use it without copying any memory.

Does this guarantee apply to NumPy arrays stored in dictionaries? What about dictionaries of dictionaries?

The docs at Serialization — Ray v2.0.0.dev0 say: “You can often avoid serialization issues by using only native types (e.g., numpy arrays or lists/dicts of numpy arrays and other primitive types), or by using Actors [to] hold objects that cannot be serialized.” But it’s not clear to me whether that covers recursive lists/dictionaries.

Would zero-copy deserialization work with the following? Or, how can I tell whether zero-copy deserialization occurred so I can test it myself?

a)

def a(x):
    return {'a': large_numpy_array(x)}

b)

def b(x):
    return {'a': {'b': large_numpy_array(x)}}

c)

def c(x):
    return {'a': [{'b': large_numpy_array(x)}], 'c': large_numpy_array(x)}

Yep, Ray supports all of these cases with zero-copy. There’s an easy way to tell:

import numpy as np
import ray

ray.init()
x_id = ray.put({"x": {"y": np.zeros(5)}})
x = ray.get(x_id)
print(x["x"]["y"].flags)
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : False
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False

WRITEABLE : False is a sure sign that we’re using zero-copy memory from plasma.
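
To run the same check on deeper structures such as case (c), a small helper can walk the nested dicts and lists and report the flags of every array it finds. This is only a sketch: `array_flags`, `looks_zero_copy`, and `check_with_ray` are hypothetical names, not part of Ray, and treating "OWNDATA False and WRITEABLE False" as evidence of zero-copy is a heuristic based on the output above (any read-only view of another buffer would pass it too).

```python
import numpy as np


def array_flags(obj, path="root"):
    """Yield (path, owndata, writeable) for every NumPy array nested in obj."""
    if isinstance(obj, np.ndarray):
        yield path, bool(obj.flags.owndata), bool(obj.flags.writeable)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from array_flags(value, f"{path}[{key!r}]")
    elif isinstance(obj, (list, tuple)):
        for i, value in enumerate(obj):
            yield from array_flags(value, f"{path}[{i}]")


def looks_zero_copy(obj):
    """Heuristic: every nested array is a read-only view of memory it does not own."""
    flags = list(array_flags(obj))
    return bool(flags) and all(not own and not wr for _, own, wr in flags)


def check_with_ray():
    """Round-trip a structure like case (c) through the object store and inspect it."""
    import ray  # imported lazily so the helpers above work without Ray installed

    ray.init(ignore_reinit_error=True)
    big = np.zeros(1_000_000)  # assumed size, comfortably above the 100KB threshold
    nested = {"a": [{"b": big}], "c": big}
    result = ray.get(ray.put(nested))
    for path, own, wr in array_flags(result):
        print(path, "OWNDATA:", own, "WRITEABLE:", wr)
    return looks_zero_copy(result)
```

Calling `check_with_ray()` prints one line per nested array. If any of them shows OWNDATA or WRITEABLE as True, that array was most likely copied rather than mapped from the object store.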