How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello everyone, my issue is the following. I'm trying to auto-batch requests before sending them to the OpenAI API using the batch decorator.
It initially works fine, but after 11 requests the batch is sent to the API (even though I should be able to accumulate thousands of requests into a single batch).
From what I understand, this is likely a bottleneck in the proxy. One solution proposed in this discussion was to add more replicas and nodes, but that is not an option for my task: I need to send as few batches as possible (due to OpenAI API rate limits).
I tried multiple things, including disabling the proxy with `serve.start(http_options={"proxy_location": ProxyLocation.Disabled})`,
but that was unsuccessful. I even tried writing my own batcher, thinking the batching logic was the bottleneck, but the problem really does seem to be on the incoming-request side: after 11 requests my server pauses to process them, and once the responses have been sent back it repeats the same pattern (11 requests, process, respond; 11 requests, process, respond; and so on).
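For context, my understanding is that `@serve.batch` roughly works like this: requests are queued until either `max_batch_size` is reached or `batch_wait_timeout_s` elapses, and only then is the handler called with the whole batch. Here is a minimal stand-alone sketch of that behavior (pure asyncio, not Ray code; the `Batcher` class is hypothetical and its parameter names just mirror the decorator's options):

```python
import asyncio


class Batcher:
    """Toy request batcher: collects items until max_batch_size is reached
    or batch_wait_timeout_s elapses, then hands them to `handler` together.
    Illustrative only -- not Ray Serve's actual implementation."""

    def __init__(self, handler, max_batch_size=1000, batch_wait_timeout_s=0.5):
        self.handler = handler
        self.max_batch_size = max_batch_size
        self.batch_wait_timeout_s = batch_wait_timeout_s
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, item):
        # Each caller gets a future that resolves when its batch is processed.
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((item, fut))
        return await fut

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until the first item arrives, then start the batch window.
            batch = [await self._queue.get()]
            deadline = loop.time() + self.batch_wait_timeout_s
            while len(batch) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(
                        await asyncio.wait_for(self._queue.get(), timeout)
                    )
                except asyncio.TimeoutError:
                    break  # window closed; flush what we have
            results = await self.handler([item for item, _ in batch])
            for (_, fut), result in zip(batch, results):
                fut.set_result(result)
```

If Ray's decorator follows this shape, the batch size I observe should be bounded only by `max_batch_size` and the timeout, so a flush at exactly 11 requests suggests something upstream (proxy or HTTP layer) is limiting how many requests reach the replica concurrently, rather than the batcher itself cutting the batch short.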
I'm open to any suggestions for solving this issue.