Hi all,
I am trying to do some load testing to see how many parallel computations my server can handle. I noticed that Ray Serve supports batching requests. I wrote a piece of code based on the documentation. It deploys, but I am unable to see the actual output.
As a POC, all I want to do is pass in a list of strings and get the same list of strings back.
Here is the backend:
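from typing import List
from ray import serve

@serve.deployment(route_prefix="/test", ray_actor_options={"num_gpus": 0.5}, num_replicas=2)
class BatchedBackend:
    @serve.batch
    async def handle_batch(self, requests: List):
        # Each element is one HTTP request; return one result per request, in order.
        results = []
        for request in requests:
            results.append(await request.json())
        return results

    async def __call__(self, request):
        # Return the awaited result so the caller receives this request's output.
        return await self.handle_batch(request)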
Now, when I try to send a POST request, it keeps failing.
import requests

data = ["This is a test", "Just testing on batches"]
requests.post("http://127.0.0.1:8000/test", data=data)
Am I supposed to use JSON in the POST request? If so, how? I would appreciate any help.
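For reference, here is a minimal sketch of what a JSON body for this request could look like. It uses only the standard library and builds the request without sending it, so it runs without a server. The helper name `build_json_post` is hypothetical, and the URL is the local endpoint from the snippet above.

```python
import json
import urllib.request

# Assumed local Serve endpoint, taken from the post above.
URL = "http://127.0.0.1:8000/test"


def build_json_post(url, payload):
    """Build (but do not send) a POST request whose body is JSON.

    Hypothetical helper for illustration only.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


data = ["This is a test", "Just testing on batches"]
req = build_json_post(URL, data)

# Decode the body again to check that the list of strings survives serialization.
decoded = json.loads(req.data.decode("utf-8"))
print(decoded)
```

With the `requests` library, passing `json=data` instead of `data=data` should produce an equivalent body and `Content-Type` header. To actually send the request built above, `urllib.request.urlopen(req)` would be the next step.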