New FastAPI HTTP Deployments running on uvicorn

I am a bit confused by the new FastAPI HTTP Deployment and I am having trouble getting it running with my existing application.

I am currently running my application with uvicorn on port 8080. I am able to launch Serve with http_options=dict(location="NoServer").

I tried using APIRouter and the app instance, but I keep getting this error:

ray.serve.exceptions.RayServeException: serve.get_replica_context() may only be called from within a Ray Serve backend.

serve is started as follows:

ray.init(address="auto", ignore_reinit_error=True, namespace="serve")
serve.start(
    http_options=dict(location="NoServer"),
    detached=True,
)

FastAPI is started as follows:

ray.init(address="auto", ignore_reinit_error=True, namespace="serve")

uvicorn.run(
    "app:app",
    port=project_config.app.port,
    log_level=project_config.app.log_level,
)

My code is taken from the example:

from fastapi import FastAPI
from ray import serve

app = FastAPI()


@serve.deployment(name="deployment", route_prefix="/api")
@serve.ingress(app)
class MyFastAPIDeployment:
    @app.get("/")
    def root_handler(self):
        return "hello from /api"

    @app.get("/{subpath}")
    def subpath_handler(self, subpath: str):
        return f"hello from /{subpath}"


MyFastAPIDeployment.deploy()

AFAIK, you shouldn’t run uvicorn directly in the Serve runtime, since the cluster is responsible for maintaining a single uvicorn server per node. Instead, call serve.start() to initialize the runtime before deploying, or follow the deployment guide to serve in a long-lived Ray cluster.

Hi @amiasato ,

I launched Ray Serve in detached mode and without an HTTP server. In my case I am using my own uvicorn server.

Do you have any reason to use your own uvicorn server? This was one of the old ways of integrating FastAPI with Ray Serve and is currently deprecated.

@Javier_Bosch I think there is a bit of a mixture of approaches happening here. In general, as @amiasato mentioned, the recommended way to integrate with FastAPI is the native integration. If you want to run your own HTTP server and call Serve from within it, you can do that, but you don’t need to use the @serve.ingress decorator at all. You can simply deploy and call your deployment using the regular Python API:

# This can all happen in your FastAPI server.
Deployment.deploy()
handle = Deployment.get_handle()
ray.get(handle.remote())