Facing Serialization issues

SumanthDatta · June 14, 2021, 2:45pm

Hi,

We are facing serialization exceptions when using a Parallel Iterator on a custom object that contains pyarrow data. Please find the screenshot attached.

Ray version: dev-2.0.0
Python: 3.7.9

When we debug the Ray code, the following snippet in cloudpickle_fast.py (line 668) always returns NotImplemented:

        try:
            is_anyclass = issubclass(t, type)
        except TypeError:  # t is not a class (old Boost; see SF #502085)
            is_anyclass = False

        if is_anyclass:
            return _class_reduce(obj)
        elif isinstance(obj, types.FunctionType):
            return self._function_reduce(obj)
        else:
            # fallback to save_global, including the Pickler's
            # dispatch_table
            return NotImplemented

Is there a workaround to bypass this issue, or can a fix be provided after evaluation? Any help is appreciated.

sangcho · June 16, 2021, 11:35pm

cc @suquark Have you faced this error? I remember you mentioned that some PyArrow types have serialization issues. Is there a good workaround for this?

suquark · June 17, 2021, 5:42am

This is the first time I have seen this error. To me, it indicates that the input data contains Cython objects that are not serializable (Cython objects are generally not serializable because cloudpickle/pickle cannot access the bytecode hidden by Cython). A simple reproducible example would be very helpful. One workaround is to use an alternative representation, as in https://discuss.ray.io/t/cant-pickle-pyarrow-dataset-expression/1685/8; another workaround that often helps is to avoid defining custom classes in the entrypoint script (if that is the case here).
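A minimal sketch of the "alternative representation" idea, using only the standard library. Here `threading.Lock` is just a stand-in for a Cython-backed member that cannot be pickled, and `Wrapper`, `rows`, and `_rebuild_wrapper` are hypothetical names for illustration, not part of Ray or pyarrow. The class defines `__reduce__` so that only plain Python data is serialized and the unpicklable member is rebuilt on the receiving side:

```python
import pickle
import threading


def _rebuild_wrapper(rows):
    # Runs on the receiving side: recreate the object from plain data.
    return Wrapper(rows)


class Wrapper:
    """Hypothetical custom object that holds an unpicklable member."""

    def __init__(self, rows):
        self.rows = list(rows)           # plain, picklable data
        self._handle = threading.Lock()  # stand-in for a Cython object

    def __reduce__(self):
        # Serialize only the picklable representation; the handle is
        # dropped here and recreated by _rebuild_wrapper.
        return (_rebuild_wrapper, (self.rows,))


original = Wrapper([1, 2, 3])
restored = pickle.loads(pickle.dumps(original))
```

Whether this approach fits depends on whether the Cython-backed member can be cheaply reconstructed from plain data; that is an assumption of the sketch, not something confirmed in this thread.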

SumanthDatta · June 17, 2021, 5:09pm

Yes, we identified the issue: we now avoid passing the Cython object to Ray.

selvaganesang · June 17, 2021, 5:18pm

It would be useful if the reported error included the type of the object that caused the serialization failure. That would make it easier for the caller to identify the issue.
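Until the error message carries that information, a rough sketch of how a caller might locate the offending member is shown below. It uses plain `pickle` rather than Ray's cloudpickle, so results may differ from what Ray itself reports; `find_unpicklable` and `Record` are hypothetical names, and `threading.Lock` again stands in for a Cython object. Ray also documents a `ray.util.inspect_serializability` helper with a similar goal, if your Ray version ships it.

```python
import pickle
import threading


def find_unpicklable(obj, path="obj", max_depth=3):
    """Return (path, type name) pairs for members that fail to pickle."""
    try:
        pickle.dumps(obj)
        return []
    except Exception:
        pass  # fall through and look for the offending member

    if max_depth == 0:
        return [(path, type(obj).__name__)]

    # Collect the children we know how to walk into.
    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple, set)):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(obj)]
    elif hasattr(obj, "__dict__"):
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]
    else:
        children = []

    failures = []
    for child_path, child in children:
        failures.extend(find_unpicklable(child, child_path, max_depth - 1))
    # If no child is to blame, report the object itself.
    return failures or [(path, type(obj).__name__)]


class Record:
    """Hypothetical custom object with one unpicklable attribute."""

    def __init__(self):
        self.values = [1, 2, 3]
        self.handle = threading.Lock()


report = find_unpicklable(Record())
for member_path, type_name in report:
    print(f"{member_path}: cannot pickle object of type {type_name}")
```

The `max_depth` limit keeps the walk from recursing forever on cyclic structures, at the cost of reporting a container rather than the exact member once the limit is reached.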
