Failure to serialize response

I am running Ray on Kubernetes. When I connect to a Ray cluster from a Pod and run some commands, I sometimes get the error below. Under what circumstances does this happen?

FYI, the Python and Ray versions on the client and server match.

>>> ray.is_initialized()
True
>>> ray.nodes()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/share/runtimes/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return getattr(ray, func.__name__)(*args, **kwargs)
  File "/usr/local/share/runtimes/lib/python3.9/site-packages/ray/util/client/api.py", line 208, in nodes
    return self.worker.get_cluster_info(
  File "/usr/local/share/runtimes/lib/python3.9/site-packages/ray/util/client/worker.py", line 598, in get_cluster_info
    resp = self.server.ClusterInfo(
  File "/usr/local/share/runtimes/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/usr/local/share/runtimes/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.NOT_FOUND
	details = "Failed to serialize response!"
	debug_error_string = "{"created":"@1646705403.742381878","description":"Error received from peer ipv4:10.100.214.52:10001","file":"src/core/lib/surface/call.cc","file_line":903,"grpc_message":"Failed to serialize response!","grpc_status":5}"
>

FYI, I’ve raised a similar issue to yours and am hoping to get a response: https://discuss.ray.io/t/ray-grpc-ambiguous-error-message/5874

Perhaps the suggested solution on the other thread could help.
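
Since the error in the original post is intermittent, one way to narrow down when it happens is to wrap the failing call, log every failure, and retry a few times. The following is only a minimal sketch and does not address the root cause. The helper name `call_with_retry` and its parameters are hypothetical (not part of Ray or gRPC), and the defaults are arbitrary.

```python
import time


def call_with_retry(fn, retries=3, delay=0.5, exceptions=(Exception,), log=print):
    """Call fn(); on a listed exception, log it and retry up to `retries` attempts.

    Hypothetical helper for diagnosing intermittent failures; not a Ray API.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fn()
        except exceptions as err:
            last_error = err
            # Log each failure so it can be correlated with cluster state later.
            log(f"attempt {attempt}/{retries} failed: {err!r}")
            if attempt < retries:
                time.sleep(delay)
    # Every attempt failed: re-raise the last error for the caller to handle.
    raise last_error
```

With Ray Client, this could be used as `call_with_retry(ray.nodes, exceptions=(grpc.RpcError,))`, assuming `ray` and `grpc` are imported and the connection is already initialized. The `_InactiveRpcError` in the traceback is a subclass of `grpc.RpcError`, so it would be caught. If the call always succeeds on a later attempt, that suggests a transient server-side condition rather than a version mismatch.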