I have an existing FastAPI backend server handling business logic. How can I extend this backend to add a simple machine learning model that runs inference on a POST request?
I have read Ray with FastAPI - Ray Core - Ray and FastAPI + Ray Core vs FastAPI + Ray Serve? - Ray Serve - Ray, but neither thread offers a solution.
From the Ray Serve docs, using @serve.deployment seems to mean placing my entire backend inside the Ray cluster, with the entrypoint serve run xx:xx. What's the advantage of this approach?
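For context, this is roughly how I understand that approach would look (a sketch based on my reading of the @serve.ingress docs; the class name, route, and placeholder body are mine):

    from fastapi import FastAPI
    from ray import serve

    app = FastAPI()

    @serve.deployment
    @serve.ingress(app)
    class BackendIngress:
        # the whole backend lives inside the Serve deployment
        @app.post("/extract_keywords")
        async def extract_keywords(self) -> dict:
            return {"keywords": []}  # placeholder; the real business logic would go here

    backend_app = BackendIngress.bind()
    # started with: serve run my_module:backend_app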
Another approach that makes more sense to me is to keep the original backend and offload only the computation to Ray, using the serve.deployment DeploymentHandle API. How can I achieve this?
This is the Actor running inference code:
    from keybert import KeyBERT
    from ray import serve

    @serve.deployment()
    class KeywordExtractor:
        def __init__(self):
            self.model = KeyBERT()  # load the KeyBERT model once per replica

        def keyword_extract(self, doc: str):
            # returns a list of (keyword, similarity score) tuples
            model_output = self.model.extract_keywords(doc, keyphrase_ngram_range=(1, 1), stop_words=None)
            return model_output

    keyword_extractor = KeywordExtractor.bind()  # a bound application, not yet a handle
And I'm looking to call it from an endpoint in my existing FastAPI app:
    import ray
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI()

    class Doc(BaseModel):
        text: str  # request body: a single 'text' field

    @app.post("/extract_keywords")
    async def extract_keywords(doc: Doc):
        if not doc.text:
            raise HTTPException(
                status_code=400,
                detail={"error": "No text provided", "hint": "Please include a non-empty 'text' field in the request body."},
            )
        # this is the call I'm unsure about: keyword_extractor here is the bound app, not a handle
        keywords_sim = await keyword_extractor.keyword_extract.remote(doc.text)
        keywords_sim = ray.get(keywords_sim)
        keywords = [kw[0] for kw in keywords_sim]
        return {"keywords": keywords}
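In case it helps clarify what I'm after, this is the kind of wiring I'm imagining for the second approach (a minimal sketch, assuming Ray 2.7+ DeploymentHandle semantics and that Serve is started from the same process as my backend; it reuses keyword_extractor and Doc from above, and I don't know if this is the intended way):

    import ray
    from contextlib import asynccontextmanager
    from fastapi import FastAPI
    from ray import serve

    handle = None  # DeploymentHandle, filled in at startup

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        global handle
        ray.init()  # or ray.init(address="auto") to attach to an existing cluster
        # serve.run deploys the bound application and (as far as I understand) returns a handle
        handle = serve.run(keyword_extractor)
        yield
        serve.shutdown()
        ray.shutdown()

    app = FastAPI(lifespan=lifespan)

    @app.post("/extract_keywords")
    async def extract_keywords(doc: Doc):
        response = handle.keyword_extract.remote(doc.text)  # DeploymentResponse
        keywords_sim = await response  # awaiting gives the result directly, no ray.get
        return {"keywords": [kw[0] for kw in keywords_sim]}

In particular, I'm not sure whether serve.run is meant to be called from inside the backend process like this, or whether I should deploy the model separately and connect to it with serve.get_app_handle / serve.get_deployment_handle.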