How severe does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hello everyone,
Thanks for building this awesome framework. It has helped me a lot recently with deploying large open-source language models.
I'm now building a text embedding server with Ray Serve, and specifically I want to use Ray Serve's dynamic batching feature. My code currently works, but I have two questions:
- Does Ray Serve dynamic batching work with a FastAPI ingress? I haven't seen any example of how to do that. For now I use Ray Serve on its own, without any FastAPI integration, in order to get this feature working.
- When using dynamic batching, how can I read the request headers, not only the request body? I would like to add bearer-token authentication to my deployment.
My current code, without any header handling, looks like this:
    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.5)
    async def handle_batch(self, request_list):
        for id, request in enumerate(request_list):
            try:
                content = await request.json()
                ....

    async def __call__(self, request: Request) -> List[str]:
        return await self.handle_batch(request)
How can I add the headers to that? I tried:
    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.5)
    async def handle_batch(self, request_list, header_list):
        for id, request in enumerate(request_list):
            try:
                content = await request.json()
                ....

    async def __call__(self, request: Request, header: Header) -> List[str]:
        return await self.handle_batch(request, header)
but it keeps failing with:
Traceback (most recent call last):
(ServeReplica:default_Embedding_Server pid=11547) File "/usr/local/lib/python3.10/dist-packages/ray/serve/_private/replica.py", line 633, in invoke_single
(ServeReplica:default_Embedding_Server pid=11547) result = await method_to_call(*request_args, **request_kwargs)
(ServeReplica:default_Embedding_Server pid=11547) TypeError: Embedding_Server.__call__() missing 1 required positional argument: 'header'
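One direction that might avoid the TypeError above, as a minimal sketch and not a confirmed Ray Serve recipe: the traceback suggests that Serve calls `__call__` with the request only, so a second `header` parameter never gets filled in. The Starlette `Request` object already exposes its headers as `request.headers`, so the batch method could keep its single `request_list` argument and read the header from each request. The helper names `extract_bearer_token` and `authorize_batch` and the `valid_tokens` set are hypothetical, introduced only for illustration; the code uses plain mappings so it runs without Ray installed.

```python
from typing import Iterable, List, Mapping, Optional, Set


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    # Header names are case-insensitive in HTTP. A plain dict is not,
    # so normalize the keys to keep this sketch usable with any mapping.
    normalized = {key.lower(): value for key, value in headers.items()}
    value = normalized.get("authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def authorize_batch(
    header_list: Iterable[Mapping[str, str]], valid_tokens: Set[str]
) -> List[bool]:
    """Hypothetical helper: one True/False per request, in batch order."""
    return [extract_bearer_token(headers) in valid_tokens for headers in header_list]
```

Inside `handle_batch`, the assumed usage would be something like `token = extract_bearer_token(request.headers)` for each `request` in `request_list`, with `__call__` left at its original single-argument signature.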
Thanks in advance. It would help me a lot if I could solve this.