How severe does this issue affect your experience of using Ray?
- High
Hello, I have been able to create a deployment using Ray Serve, which takes documents and processes them (NLP inference). I followed the examples from the "Set Up a gRPC Service" guide (Ray 2.9.1) and was able to implement gRPC, with the server running at localhost:9000.
I am able to send requests from the client using the following:
# Running on osx 14.2
# grpc==1.5.9
# protobuf==4.25.1
# ray[serve]==2.9.1
import grpc
from my_service_pb2_grpc import MyServiceStub
from my_service_pb2 import MyMessage, MyResponse
options = [
    ('grpc.max_message_length', 1024 * 1024 * 1024),
    ('grpc.max_send_message_length', 1024 * 1024 * 1024),
    ('grpc.max_receive_message_length', 1024 * 1024 * 1024),
]
channel = grpc.insecure_channel("localhost:9000", options=options)
stub = MyServiceStub(channel)
# data is a list of dicts that conforms to the protobuf schema
request = MyMessage(documents = data)
response, call = stub.__call__.with_call(request=request)
When `data` is sufficiently small (<4 MB), everything runs well. However, when `data` is >4 MB, I receive an error such as:
_InactiveRpcError Traceback (most recent call last)
Cell In[12], line 1
----> 1 response, call = stub.__call__.with_call(request=request)
File ~/miniconda3/envs/myservice/lib/python3.10/site-packages/grpc/_channel.py:1178, in _UnaryUnaryMultiCallable.with_call(self, request, timeout, metadata, credentials, wait_for_ready, compression)
1163 def with_call(
1164 self,
1165 request: Any,
(...)
1170 compression: Optional[grpc.Compression] = None,
1171 ) -> Tuple[Any, grpc.Call]:
1172 (
1173 state,
1174 call,
1175 ) = self._blocking(
1176 request, timeout, metadata, credentials, wait_for_ready, compression
1177 )
-> 1178 return _end_unary_response_blocking(state, call, True, None)
File ~/miniconda3/envs/myservice/lib/python3.10/site-packages/grpc/_channel.py:1004, in _end_unary_response_blocking(state, call, with_call, deadline)
1002 return state.response
1003 else:
-> 1004 raise _InactiveRpcError(state)
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Received message larger than max (8468525 vs. 4194304)"
debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:9000 {created_time:"2024-01-31T14:09:19.248513-05:00", grpc_status:8, grpc_message:"Received message larger than max (8468525 vs. 4194304)"}"
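To make the numbers in that error easier to read, here is a minimal sketch that parses the `details` string and compares the reported limit against two values: gRPC's default maximum receive size (4 MiB) and the 1 GiB value passed in the client `options` above. The constant names are mine, for illustration only. The limit in the message matches the gRPC default rather than the client setting, and the debug string says "Error received from peer", which suggests the limit was enforced on the server side.

```python
import re

# gRPC's default maximum receive message size: 4 MiB = 4194304 bytes.
DEFAULT_GRPC_MAX_RECEIVE_BYTES = 4 * 1024 * 1024
# Value passed for the message-length keys in the client `options` list.
CLIENT_CONFIGURED_LIMIT_BYTES = 1024 * 1024 * 1024

# Copied verbatim from the traceback above.
details = "Received message larger than max (8468525 vs. 4194304)"

# The two integers are the message size and the limit that rejected it.
match = re.search(r"\((\d+) vs\. (\d+)\)", details)
message_bytes, limit_bytes = (int(group) for group in match.groups())

print(f"message size: {message_bytes / 2**20:.2f} MiB")
print(f"limit hit:    {limit_bytes / 2**20:.2f} MiB")
print("limit equals gRPC default: ", limit_bytes == DEFAULT_GRPC_MAX_RECEIVE_BYTES)
print("limit equals client option:", limit_bytes == CLIENT_CONFIGURED_LIMIT_BYTES)
```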
Things I have done
- As you can see above, I have set `'grpc.max_send_message_length'` and `'grpc.max_receive_message_length'` in the options passed when creating the channel.
- I also ran into errors when receiving large responses, but setting `'grpc.max_receive_message_length'` in the client-side options fixed this. This leads me to believe the remaining issue is on the Ray Serve side.
- I have looked at the code in `gcs_util.py` and `ray_constants.py`, and it seems that all the constants point to message sizes of 250 to 500 MB, and that these options should be passed to the gRPC server.
- I have also run into this error on a Windows 10 machine, where my service is able to receive requests and pass back responses without issue as long as the initial MyMessage size is <4 MB.
- I have tried downgrading to `ray 2.9.0` and am receiving the same errors.
I am really stuck as this seems to be coming from ray serve, and would appreciate help in solving this as it is preventing me from moving forward. Thanks in advance!