Hi, I'm experimenting with serve.batch.
First, I start a local cluster with `ray start --head`, then launch Serve with `serve start`. Next, I deploy the simple code below:
```python
import ray
from ray import serve
from starlette.requests import Request
from typing import List
from pydantic import BaseModel


class PricingRequest(BaseModel):
    valuation_date: str
    pid: str
    payoff: dict
    cmm: dict


# route_prefix must be passed by keyword
@serve.deployment(route_prefix="/present_value")
class PricingApi:
    @serve.batch
    async def present_value(self, reqs: List[Request]):
        print(f"batched {len(reqs)} reqs")
        data = []
        for r in reqs:
            p = await r.json()
            data.append(p)
        return [{"pid": r["pid"], "pv": 0.0} for r in data]

    async def __call__(self, req: Request):
        # Callers pass a single request; serve.batch assembles the list.
        return await self.present_value(req)


# Connect to the cluster started by `ray start --head`
ray.init(address="auto", namespace="serve")
PricingApi.deploy()
```
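So far everything works fine, and I'm able to call the API with a curl POST like the following (the payload is just an illustration matching the PricingRequest schema; Serve's default HTTP port is 8000):

```
curl -X POST http://127.0.0.1:8000/present_value \
  -H "Content-Type: application/json" \
  -d '{"valuation_date": "2021-01-04", "pid": "p1", "payoff": {}, "cmm": {}}'
```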
Then I start a Java application that sends the same POST requests in parallel. However, every log line on the server reads 'batched 1 reqs'. I understand serve.batch is opportunistic, but I would assume the example above should form batches larger than one.
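In case it's relevant: my understanding from the docs is that `serve.batch` accepts `max_batch_size` (default 10) and `batch_wait_timeout_s` (default 0.0), and with a zero timeout a batch only contains whatever requests are already queued when the method runs. A variant I could try (values arbitrary):

```python
# Hypothetical tuning of the documented serve.batch parameters
# (defaults: max_batch_size=10, batch_wait_timeout_s=0.0)
@serve.batch(max_batch_size=32, batch_wait_timeout_s=0.1)
async def present_value(self, reqs: List[Request]):
    ...
```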
I'm using the latest nightly build and Python 3.8.
Thanks,
-BS