How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty in completing my task, but I can work around it (I will prototype without Ray Serve).
Hi! I’m struggling to get started with Ray Serve for serving my PyTorch models. I’m trying to achieve the following:
- I have a FastAPI server that serves user requests and interacts with a database to support the required business logic.
- Whenever users hit the `/api/test` endpoint, it should perform inference using a PyTorch model, apply some business logic to the result, and respond to the client (see the sketch after this list).
- I would like to retain the other routes and auth middleware already active on the FastAPI server.
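For concreteness, here is a rough sketch of the shape I have in mind today, without Ray Serve. The `/api/test` path is the real one, but the model file, input handling, and business logic below are just placeholders, not my real code:

```python
# Rough sketch of the current (non-Serve) setup; the model file, input format,
# and business logic are placeholders.
import torch
from fastapi import FastAPI

app = FastAPI()

# ... existing routes, auth middleware, and database access live on `app` ...

model = torch.jit.load("model.pt")  # placeholder for however the model is loaded
model.eval()

@app.post("/api/test")
async def run_inference(payload: dict):
    # Turn the request payload into a tensor (placeholder preprocessing).
    inputs = torch.tensor(payload["features"], dtype=torch.float32)
    with torch.no_grad():
        prediction = model(inputs).tolist()
    # Some business logic would be applied to `prediction` here before responding.
    return {"prediction": prediction}
```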
I read the following resources:
- docs here
- Ray: Serve ML models docs (Link redacted because I’m a new user and can’t post more than 2 links)
- And found a few interesting examples online (e.g. this one)
I tried to start with a simple test, and used the following code from the docs, adding it to a file called `test.py`:
```python
import ray
import requests
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles  # needed for the static mount below
from ray import serve

app = FastAPI()

# NOTE: This is my custom addition to serve some static pages.
app.mount("/static", StaticFiles(directory="static"), name="static")


@serve.deployment(route_prefix="/hello")
@serve.ingress(app)
class MyFastAPIDeployment:
    @app.get("/")
    def root(self):
        return "Hello, world!"

    @app.post("/{subpath}")
    def root(self, subpath: str):
        return f"Hello from {subpath}!"


serve.run(MyFastAPIDeployment.bind())
resp = requests.post("http://localhost:8000/hello/Serve")
assert resp.json() == "Hello from Serve!"
```
When running `python test.py`, I see the following:
```
2023-01-06 18:15:14,000 INFO worker.py:1529 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
(ServeController pid=19356) INFO 2023-01-06 18:15:15,660 controller 19356 http_state.py:129 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-a90809961c7f36842d67eba6574e31d92c448022b3263c841b3b1b11' on node 'a90809961c7f36842d67eba6574e31d92c448022b3263c841b3b1b11' listening on '127.0.0.1:8000'
(HTTPProxyActor pid=19944) INFO: Started server process [19944]
(ServeController pid=19356) INFO 2023-01-06 18:15:16,320 controller 19356 deployment_state.py:1310 - Adding 1 replica to deployment 'MyFastAPIDeployment'.
(HTTPProxyActor pid=19944) INFO 2023-01-06 18:15:19,354 http_proxy 127.0.0.1 http_proxy.py:361 - POST /hello 200 8.0ms
(ServeReplica:MyFastAPIDeployment pid=1336) INFO 2023-01-06 18:15:19,352 MyFastAPIDeployment MyFastAPIDeployment#AFMcPK replica.py:505 - HANDLE call OK 4.0ms
```
But the server seems to die right after this gets executed, i.e. it does not live beyond the last assert. I tried `serve run test:MyFastAPIDeployment` as well, but with no luck.
My expectation would be that, with some command, a FastAPI server would spin up and stay up, serving my requests. Is my mental model wrong? Is there any working example of a FastAPI server serving a model alongside some minimal/toy business logic and non-model endpoints?
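To make my mental model concrete, this is roughly what I mean by "spin up and stay up": with plain FastAPI I would run the app under uvicorn and the process blocks and keeps serving until I stop it (the module and handler names here are just illustrative):

```python
# plain_fastapi.py -- illustrative only; this is the behavior I expected to get
# from the Ray Serve version with some equivalent command.
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/test")
def test():
    return {"status": "ok"}

if __name__ == "__main__":
    # uvicorn.run blocks here and keeps serving requests until interrupted.
    uvicorn.run(app, host="127.0.0.1", port=8000)
```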
Thank you, and sorry if this is a very basic/dumb question.