Why you choose msgpack for ray-c++ and protobuf for ray-python? why not unify the serialization tools?

Hi, I have a quick question for serialization, why you choose msgpack for ray-c++ and protobuf for ray-python? why not unify the serialization tools? thanks!

Hey, I think you may be conflating 2 different layers of serialization. In python, we use cloudpickle for serializing code and application level objects.

As you may imagine, cloudpickle is used in the python layer only. From that point on, the lower layers of the ray core treat it as a byte string, which, along with some additional information (like a task id, resources, etc) need to be sent to other processes/nodes. Since we use gRPC, protobufs are the natural choice at this layer.

So in the python serialization stack, we do cloudpickle for language specific constructs, and protobuf in the core.

For C++, msgpack replaces cloudpickle, but we still use protobufs internally.

Thank you so much! It’s clear now!