How to force serialisation of numpy array using buffer and not cloudpickle?

Hi all! I have recently been using Ray to farm out some compute-heavy tasks. Things were going slower than I expected, so I set up py-spy to see what was going on, and it turns out that serialisation is the largest overhead.

This was unexpected, because I’m sending a numpy array, and returning a numpy array.

Looking at the flame chart above, serialisation is all in blue, and I notice that there’s a fun cloudpickle.dumps in the serialisation at the end when the actor returns its result (a 1D numpy array).

Does anyone know how I can dig into this deeper to figure out why exactly numpy isn’t being sent via buffer? I’m at a total loss debugging this myself!


Hi @Samuel_Hinton ! Hm, that is unexpected based on my understanding. Could you share a code snippet that shows how you’re sending/receiving a numpy array?
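
In the meantime, one check that might help narrow this down: as far as I understand, Ray serialises with cloudpickle on top of pickle protocol 5, so seeing cloudpickle.dumps in the profile does not by itself mean the array is being copied in-band. Below is a rough sketch that uses only the standard pickle module and numpy, not Ray's actual serialiser, to see whether a given array hands its data over as an out-of-band buffer. The helper name out_of_band_report is made up for illustration.

```python
import pickle

import numpy as np


def out_of_band_report(obj):
    """Pickle obj with protocol 5 and report how much data left as buffers.

    Hypothetical helper for illustration; it does not call into Ray.
    """
    buffers = []
    # buffer_callback collects every PickleBuffer the object exposes,
    # so that data is kept out of the returned byte string.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return {
        "in_band_bytes": len(payload),
        "out_of_band_buffers": len(buffers),
        "out_of_band_bytes": sum(memoryview(b).nbytes for b in buffers),
    }


if __name__ == "__main__":
    contiguous = np.arange(1000, dtype=np.float64)
    as_objects = np.array([1, "a", None], dtype=object)

    print("float64, contiguous:", out_of_band_report(contiguous))
    print("object dtype:       ", out_of_band_report(as_objects))
```

If the array you return reports zero out-of-band buffers here (for example because it has dtype=object, is a non-contiguous view, or is wrapped in some other Python object that pickles it the slow way), that would be a plausible reason for the time spent in serialisation. This is an assumption to test against your real data, not a diagnosis.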