How to View Results of a POST Request with Ray Serve Batching?

Hi all,

I am trying to do some load testing to see how many parallel computations my server can handle. I noticed Ray Serve supports batching requests. I wrote a piece of code based on the documentation; it deploys fine, but I am unable to see the actual output.

All I want to do as a proof of concept is pass in a list of strings and get the same list of strings back.
Here is the backend:

@serve.deployment(route_prefix="/test", ray_actor_options={"num_gpus": 0.5}, num_replicas=2)
class BatchedBackend:
    @serve.batch
    async def handle_batch(self, requests: List):
        results = []
        for request in requests:
            results.append(request.json())
        return results

    async def __call__(self, request):
        await self.handle_batch(request)

Now, when I try to send a POST request, it keeps failing.

data = ['This is a test', 'Just testing on batches']
requests.post("http://127.0.0.1:8000/test", data=data)

Am I supposed to use JSON in the POST request? If so, how? I would appreciate any help.

I think our default call signature expects a Starlette Request object, so the input types are mismatched. You can try a simple case on localhost with CPU only, but make sure to handle and parse the request in the __call__ function before handing data to the batched method.

A simple first test would be to inspect what you actually receive, e.g. by printing request.query_params or the parsed request body, then proceed from there.

You can see examples in:

https://docs.ray.io/en/master/serve/http-servehandle.html#servehandle-calling-deployments-from-python