Automatic-deserialization? - Question about Numpy arrays

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Hi there, can someone explain to me whats going on here? I’ve read that " Ray supports zero-copy-read for numpy (for float or integer types) array."

but I’m not sure if this is the same as my phenomenon.

I have self play workers fulfilling roll outs and returning numpy arrays. in the main thread these results are Object refs as expected. I then pass these to a replay buffer using an add method, but i notice that theres significant delay between when the workers finish, and when the buffer adds them. This lead me to consider whether the arrays is being passed to the buffer rather than the reference and now i’m somewhat confused about whats going on here. Could someone explain?

Here’s the code:

ps = Policy_Server.remote(DEVICE)
sp = self_play_worker.options(max_concurrency=MAX_WORKERS).remote(ps, DEPTH)
buffer = ReplayBuffer.options(max_concurrency=2).remote(capacity=CAPACITY)

sp_refs = [sp.self_play.remote() for _ in range(GAMES_TOTAL)]
buffer_refs = []
start_time = time.time()
# ? Preload the ReplayBuffer

print(f"Number of Jobs Running: {len(sp_refs)=}")
while sp_refs:
     complete, sp_refs = ray.wait(sp_refs)
     print(f"{complete=}")
     buffer.add.remote(complete)

In the above, i get the following thus far and this is what i expected.

Number of Jobs Running: len(sp_refs)=1
complete=[ObjectRef(f91b78d7db9a659302e1adadcf69844dfe0a8bc60100000001000000)]

the buffers code however is:

    def add(self, experience: ray.ObjectRef):
        print(f"{experience} - add function")
        print(f"{type(experience)=}")
        print(f"{type(experience[0])}")
        data = ray.get(experience)
        print(data)
        print(type(data))
        assert False

This gives the following print out:

ValueError: Invalid type of object refs, <class 'numpy.ndarray'>, is given. 'object_refs' must either be an ObjectRef or a list of ObjectRefs.
(ReplayBuffer pid=527599) [ObjectRef(f91b78d7db9a659302e1adadcf69844dfe0a8bc60100000001000000)] - add function
(ReplayBuffer pid=527599) type(experience)=<class 'list'>
(ReplayBuffer pid=527599) <class 'ray._raylet.ObjectRef'>
(ReplayBuffer pid=527599) [array([[1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 1.829e+03,
(ReplayBuffer pid=527599)         0.000e+00],
(ReplayBuffer pid=527599)        [1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 1.837e+03,
(ReplayBuffer pid=527599)         0.000e+00],
(ReplayBuffer pid=527599)        [1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 2.997e+03,
(ReplayBuffer pid=527599)         0.000e+00],
(ReplayBuffer pid=527599)        ...,
(ReplayBuffer pid=527599)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 4.105e+03,
(ReplayBuffer pid=527599)         0.000e+00],
(ReplayBuffer pid=527599)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 3.510e+03,
(ReplayBuffer pid=527599)         0.000e+00],
(ReplayBuffer pid=527599)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 2.358e+03,
(ReplayBuffer pid=527599)         0.000e+00]])]
(ReplayBuffer pid=527599) <class 'list'>
(ReplayBuffer pid=527599) [[1.000e+00 1.000e+00 1.000e+00 ... 0.000e+00 1.829e+03 0.000e+00]
(ReplayBuffer pid=527599)  [1.000e+00 1.000e+00 1.000e+00 ... 0.000e+00 1.837e+03 0.000e+00]
(ReplayBuffer pid=527599)  [1.000e+00 1.000e+00 1.000e+00 ... 0.000e+00 2.997e+03 0.000e+00]
(ReplayBuffer pid=527599)  ...
(ReplayBuffer pid=527599)  [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 4.105e+03 0.000e+00]
(ReplayBuffer pid=527599)  [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 3.510e+03 0.000e+00]
(ReplayBuffer pid=527599)  [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 2.358e+03 0.000e+00]] - add function
(ReplayBuffer pid=527599) type(experience)=<class 'numpy.ndarray'>
(ReplayBuffer pid=527599) <class 'numpy.ndarray'>

I mean it looks like it at least gets to the buffer as an object reference, but i dont understand why it doesnt stay that, given that it seems to be called twice. once as a Object Reference, and once again as a numpy array. (without ray.get() being called)

but i dont understand why it doesnt stay that, given that it seems to be called twice. once as a Object Reference, and once again as a numpy array. (without ray.get() being called)

Hi @KAV101 , I’m not sure I understand the question.
experience is passed in as a list of ObjectRef, and once you call ray.get() on it, it resolves to the data, in this case a list of numpy array.

Hi there,

Perhaps this helps - Consider an alternative situation.

I have some ray generator in the style of Ray-generator docs

I’ve followed their example of unpacking the generator like so

ready, unready = [], [*gen]
    buffer_jobs = []
    while unready:
        ready, unready = ray.wait(unready)
        for r in ready:
            if isinstance(r, ObjectRefGenerator):
                try:
                    ref = next(r)
                    buffer_jobs.append(buffer.add.remote(ref))
                except StopIteration:
                    pass
                else:
                    unready.append(r)
            else:
                print("O hai Dare!")

From here, I’m expecting ref to be an ObjectRef which i can then pass to buffer via its add method.

When i look inside buffer’s add method i ask it to print out what it’s just recieved - Still expecting a ObjectRef. but instead i find a np.ndarray.

So i guess my question is when did it go from an ObjectRef to the array?

heres buffers code:

def add(self, experience):
        print(f"{experience=}") # <== Expecting ObjectRef, Get np.ndarray
        exp = ray.get(experience) # <== Failing because not an ObjectRef
        print(f"{exp.shape=}")
        for e in exp:
            self._add(e)
        return True

with console output:

(ReplayBuffer pid=91426) experience=array([[1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 1.837e+03,
(ReplayBuffer pid=91426)         0.000e+00],
(ReplayBuffer pid=91426)        [1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 7.700e+01,
(ReplayBuffer pid=91426)         0.000e+00],
(ReplayBuffer pid=91426)        [1.000e+00, 1.000e+00, 1.000e+00, ..., 0.000e+00, 1.837e+03,
(ReplayBuffer pid=91426)         0.000e+00],
(ReplayBuffer pid=91426)        ...,
(ReplayBuffer pid=91426)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 3.510e+03,
(ReplayBuffer pid=91426)         0.000e+00],
(ReplayBuffer pid=91426)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 4.089e+03,
(ReplayBuffer pid=91426)         0.000e+00],
(ReplayBuffer pid=91426)        [0.000e+00, 0.000e+00, 0.000e+00, ..., 0.000e+00, 4.089e+03,
(ReplayBuffer pid=91426)         0.000e+00]])

hope that clears things up?