Ray Serve with vs without FastAPI

akelloway · March 4, 2021, 4:03am

New to Ray Serve (been using Ray/RLlib/Tune for a while) and have successfully followed the doc tutorials, including the FastAPI example. My use case is straightforward model deployment and serving. My question is: what benefits do I get from the FastAPI example vs “pure” Ray Serve? Why use one vs the other? Which, if either, is faster? Or is the FastAPI example there for folks who already have an extensive deployed FastAPI code base and now want to add Ray to the mix for the speed gains of distributed model inference?

Thanks in advance for insights! Just starting to get my head around Ray Serve and how I could use it for my potential use cases/applications.

eoakes · March 4, 2021, 4:39pm

Really great question @akelloway

The main benefit of using Ray Serve with FastAPI is that you get the full flexibility of FastAPI’s (awesome) HTTP framework. That means you can define pydantic types and FastAPI will automatically parse and validate incoming requests against them, you can easily define multiple routes and path variables, you can use its dependency injection system for things like DB connections, you get an auto-generated OpenAPI spec, etc. Ray Serve’s built-in HTTP server doesn’t offer these convenience features right now, so with it you will need to operate at a slightly lower level.

I would say a good rule of thumb is if you’re just doing model serving on Ray Serve, using the built-in server is probably “good enough” and will “serve” you well. If you want to build a fully-featured scalable web application, using FastAPI and scaling out the backends probably makes sense. Hope this helps!
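For comparison, a rough sketch of the “pure” Ray Serve route, where the handler receives the raw request and does its own parsing and validation. This assumes the Ray 2.x deployment API (`serve.deployment`, `.bind()`, `serve.run`), which is newer than this thread; `Doubler`, `FakeRequest`, `smoke_test`, and `deploy` are made-up names for illustration.

```python
import asyncio


class Doubler:
    """Hypothetical stand-in for a model: doubles every input feature."""

    async def __call__(self, request):
        # With the built-in HTTP server, Ray Serve passes a Starlette Request
        # to __call__; parsing and validation are done by hand.
        payload = await request.json()
        features = payload.get("features") if isinstance(payload, dict) else None
        if not isinstance(features, list):
            return {"error": "expected a JSON object with a 'features' list"}
        return {"result": [2 * float(x) for x in features]}


class FakeRequest:
    """Minimal stand-in for a Starlette Request, for local testing only."""

    def __init__(self, payload):
        self._payload = payload

    async def json(self):
        return self._payload


def smoke_test():
    # Exercise the handler without starting Ray at all.
    return asyncio.run(Doubler()(FakeRequest({"features": [1, 2]})))


def deploy():
    # Assumes Ray 2.x with the Serve extra installed (pip install "ray[serve]").
    # Imported lazily so the rest of this file works without Ray.
    from ray import serve

    serve.run(serve.deployment(Doubler).bind())
```

Here the input check that pydantic would otherwise do is written by hand, which is the kind of lower-level work the built-in server leaves to you.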

Note that we’re currently cooking up a plan to provide the best of both worlds with “native” support. We’ll track it in this issue for future readers: “[serve] use fastapi as backend?” (issue #9869 in ray-project/ray on GitHub).

eoakes · March 4, 2021, 4:41pm

Oh, another reason to use the FastAPI support (that may not be too relevant to you, but for others): if you already have an existing FastAPI server and want to scale it up using Ray Serve you can do that without changing the HTTP code.

akelloway · March 4, 2021, 5:54pm

@eoakes – thanks for the insightful answers - I now have a better understanding of the considerations for each approach – not sure which is “better” suited for my application right now, but I am in a better position to make that decision going forward. Thanks!
