Ray Serve with FastAPI slowing down performance

Hi, I have an application written in FastAPI that can handle 64 requests/second (tested with 1000 users), but when I integrate Ray Serve into it, performance is extremely bad (roughly 1.2 requests/second). I followed the example from the docs and wrapped my FastAPI application with the following code (the rest remains unchanged):

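from ray import serve

# `app` is the existing FastAPI instance, defined elsewhere in the application.
@serve.deployment(route_prefix="/")
@serve.ingress(app)
class FastAPIWrapper:
    pass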

Am I missing something?

@Jackson can you please elaborate a little bit on the issue you’re observing?

  • How are you running your benchmark?
  • Are you able to share screenshots from your Ray Serve dashboard (example below)? In particular, I’d suggest taking a look at the Tasks/Actors section and Resource Utilization
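
In case it helps make the two numbers comparable, here is a minimal sketch of a throughput measurement that uses only the Python standard library. It is not how the benchmark above was run (that is still the open question), and the names `http_get` and `measure_throughput` are hypothetical helpers made up for this illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url: str) -> int:
    """Send one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.status


def measure_throughput(send, total_requests: int, concurrency: int) -> float:
    """Call `send()` `total_requests` times from `concurrency` threads.

    Returns completed requests per second of wall-clock time.
    `send` is any zero-argument callable (hypothetical interface).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send) for _ in range(total_requests)]
        for future in futures:
            future.result()  # re-raise any error from a failed request
    elapsed = time.perf_counter() - start
    return total_requests / elapsed
```

To try it, one could call something like `measure_throughput(lambda: http_get("http://127.0.0.1:8000/"), total_requests=1000, concurrency=64)` against the plain FastAPI server and then against the Ray Serve deployment, keeping the request count and concurrency identical. The URL here is an assumption and should be replaced with wherever each server actually listens.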