Serialization error on single laptop

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hello, I’ve been training a custom environment with Ray 2.5.1 for some time. The environment has proven robust, but training a single agent with Tune occasionally raises strange exceptions from deep within Ray. The latest one produced the following error message after several thousand iterations on a single laptop with a GPU. It halted the entire Tune program, not just one worker.

2023-08-13 21:06:08,017	ERROR serialization.py:387 -- 
Traceback (most recent call last):
  File "/home/starkj/miniconda3/envs/cda0/lib/python3.10/site-packages/ray/_private/serialization.py", line 232, in _deserialize_msgpack_data
    obj = MessagePackSerializer.loads(msgpack_data, _python_deserializer)
  File "python/ray/includes/serialization.pxi", line 191, in ray._raylet.MessagePackSerializer.loads
  File "python/ray/includes/serialization.pxi", line 192, in ray._raylet.MessagePackSerializer.loads
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 16: invalid continuation byte

I don’t know where to begin figuring it out. Any help would be most appreciated. Thanks.

Thanks for bringing this up. @starkj this looks like a serialization issue in the Tune job. Can you reproduce it consistently, or is it a one-off? If you can reproduce it, could you share the reproduction script so that we can debug further?

Thanks,
cc: @matthewdeng for visibility

@XIE, @matthewdeng unfortunately, I have seen this particular error only occasionally, say 2-3 times in the past month. But you are welcome to look at the code in GitHub - TonysCousin/cda0: Initial prototype AI agent for cooperative driving automation (CDA).
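
For anyone trying to narrow this down: the last line of the traceback is the generic Python error raised when bytes that are not valid UTF-8 get decoded as a string. Below is a minimal sketch, independent of Ray and msgpack, that reproduces the same class of error. The payload and the helper name `try_decode` are made up for illustration. This does not show where the bad bytes come from in the Tune job, only what the message means.

```python
# Hypothetical payload: 16 ASCII bytes, then 0xe9, then an ASCII byte.
# In UTF-8, 0xe9 starts a 3-byte sequence, so the next byte must be a
# continuation byte (0x80-0xbf). Here it is not, so decoding fails.
payload = b"0123456789abcdef" + b"\xe9" + b"x"


def try_decode(raw: bytes):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        return None, err


text, err = try_decode(payload)
if err is not None:
    # err.start is the offset of the offending byte, as in "position 16".
    print(f"decode failed at byte {err.start}: {err.reason}")
    print(f"offending byte: {hex(err.object[err.start])}")

# The same bytes decode cleanly as Latin-1, where 0xe9 is the character 'é'.
# That is only one possible origin for such a byte; it could equally be
# arbitrary binary data.
latin1_text = payload.decode("latin-1")
```

If the error reappears, logging the offending bytes (`err.object`) around `err.start` may help tell text in a different encoding apart from binary data that was treated as a string.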