How to force serialisation of numpy array using buffer and not cloudpickle?

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hi all! Recently I have been using Ray to farm out some compute-heavy tasks. Things were going slower than I expected, so I set up py-spy to see what was going on, and it turns out that serialisation is the largest overhead.

This was unexpected, because I’m sending a numpy array, and returning a numpy array.

Looking at the flame chart above, serialisation is all in blue, and I notice that there’s a fun cloudpickle.dumps in the serialisation at the end when the actor returns its result (a 1D numpy array).

Does anyone know how I can dig into this deeper to figure out why exactly numpy isn’t being sent via buffer? I’m at a total loss debugging this myself!

Cheers!


Hi @Samuel_Hinton! Hm, that is unexpected based on my understanding. Could you share a code snippet that shows how you’re sending/receiving a numpy array?
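
While putting that snippet together, one Ray-independent check that may help narrow things down is whether the array is eligible for out-of-band buffers at all. The sketch below uses only the standard library's pickle protocol 5 and numpy; the helper name `out_of_band_buffers` and the example arrays are made up for illustration, and it does not reproduce Ray's own serialisation path. As far as I know, numpy only hands its data over as a buffer when the array is a plain contiguous `ndarray` with a non-object dtype; otherwise the data is copied into the pickle payload itself.

```python
import pickle

import numpy as np


def out_of_band_buffers(obj):
    """Pickle obj with protocol 5; return the in-band payload and any out-of-band buffers.

    Hypothetical helper for illustration only, not part of Ray or numpy.
    """
    buffers = []
    # buffer_callback receives each PickleBuffer the object offers;
    # list.append returns None, which tells pickle to keep it out-of-band.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return payload, buffers


# Example arrays (assumptions, not taken from the original post).
contiguous = np.zeros(1_000_000, dtype=np.float64)
strided = np.zeros(2_000_000, dtype=np.float64)[::2]  # 1D view, not contiguous
objects = np.array([1.0, "a", None], dtype=object)

for name, arr in [("contiguous", contiguous), ("strided", strided), ("objects", objects)]:
    payload, buffers = out_of_band_buffers(arr)
    print(f"{name}: in-band bytes={len(payload)}, out-of-band buffers={len(buffers)}")
```

If the array you return from the actor behaves like `strided` or `objects` here (no out-of-band buffers and a large in-band payload), calling `np.ascontiguousarray` on it or fixing its dtype before returning would be the first thing to try. If it behaves like `contiguous`, the overhead is probably coming from something else in the returned object.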