Docker / Apps instantly shut down after creating a replica

Hello guys,
I have a problem deploying a FastAPI app to the cluster. My container always stops after a new replica is added to the cluster. The dashboard shows no errors from my app, and no errors from the deployment either.

Setup

  • Head Container / Head Node: Redis port 5000, dashboard port 5535
  • App Container: API port 8000

Code

from fastapi import FastAPI

import ray
from ray import serve

ray.init(address="head-node:5000", namespace="hello")
serve.start(detached=True)

app = FastAPI()


@serve.deployment(num_replicas=1)
@serve.ingress(app)
class TestAPI:
    @app.get("/hello")
    def hello(self):
        return {"message": "Hello World"}


TestAPI.deploy()

Logs

INFO 2023-09-28 13:55:58,882 controller 385 controller.py:363 - Finished recovering deployments after 0.00s.
INFO 2023-09-28 13:55:58,883 controller 385 http_state.py:512 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-eceaa0b840d51c5dc2e3390afd960ebbe0ce071dd3a8d745954bf0d0' on node 'eceaa0b840d51c5dc2e3390afd960ebbe0ce071dd3a8d745954bf0d0' listening on '127.0.0.1:8000'
INFO 2023-09-28 13:55:59,777 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
INFO 2023-09-28 13:55:59,818 controller 385 deployment_state.py:1679 - Adding 1 replica to deployment TestAPI.
INFO 2023-09-28 13:55:59,819 controller 385 deployment_state.py:345 - Starting replica TestAPI#Ksxzsk for deployment TestAPI
INFO 2023-09-28 13:56:00,759 controller 385 deployment_state.py:1827 - Replica TestAPI#Ksxzsk started successfully on node eceaa0b840d51c5dc2e3390afd960ebbe0ce071dd3a8d745954bf0d0.
INFO 2023-09-28 13:57:39,586 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
INFO 2023-09-28 13:57:39,641 controller 385 deployment_state.py:952 - Stopping replica TestAPI#Ksxzsk for deployment 'TestAPI'.
INFO 2023-09-28 13:57:39,646 controller 385 deployment_state.py:1560 - Stopping 1 replicas of deployment 'TestAPI' with outdated versions.
INFO 2023-09-28 13:57:42,203 controller 385 deployment_state.py:2027 - Replica TestAPI#Ksxzsk is stopped.
INFO 2023-09-28 13:57:42,204 controller 385 deployment_state.py:1679 - Adding 1 replica to deployment TestAPI.
INFO 2023-09-28 13:57:42,204 controller 385 deployment_state.py:345 - Starting replica TestAPI#qUYwTu for deployment TestAPI
INFO 2023-09-28 13:57:43,146 controller 385 deployment_state.py:1827 - Replica TestAPI#qUYwTu started successfully on node eceaa0b840d51c5dc2e3390afd960ebbe0ce071dd3a8d745954bf0d0.
INFO 2023-09-28 14:12:44,171 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
INFO 2023-09-28 14:12:44,200 controller 385 deployment_state.py:952 - Stopping replica TestAPI#qUYwTu for deployment 'TestAPI'.
INFO 2023-09-28 14:12:44,206 controller 385 deployment_state.py:1560 - Stopping 1 replicas of deployment 'TestAPI' with outdated versions.
INFO 2023-09-28 14:12:46,765 controller 385 deployment_state.py:2027 - Replica TestAPI#qUYwTu is stopped.
INFO 2023-09-28 14:12:46,765 controller 385 deployment_state.py:1679 - Adding 1 replica to deployment TestAPI.
INFO 2023-09-28 14:12:46,765 controller 385 deployment_state.py:345 - Starting replica TestAPI#ccedmz for deployment TestAPI
INFO 2023-09-28 14:12:48,018 controller 385 deployment_state.py:1827 - Replica TestAPI#ccedmz started successfully on node eceaa0b840d51c5dc2e3390afd960ebbe0ce071dd3a8d745954bf0d0.

The application status is empty

Is there something I did wrong when deploying?

Thanks!!

Hi @Steven_Adi_Santoso, welcome to the forums!

This does look strange. What Ray version are you using? Is there any chance you’re running this script on a loop in the controller? It looks like the application is constantly being updated.
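To check how often the deployment is being updated, one option is to count the "Deploying new version" lines in the controller log and measure the gaps between them. The snippet below is a minimal sketch using only the standard library; the three log lines are copied from the question, while the names `redeploy_times` and `gaps_seconds` are made up for illustration and are not part of Ray.

```python
import re
from datetime import datetime

# Three "Deploying new version" lines copied from the controller log in the question.
LOG = """\
INFO 2023-09-28 13:55:59,777 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
INFO 2023-09-28 13:57:39,586 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
INFO 2023-09-28 14:12:44,171 controller 385 deployment_state.py:1390 - Deploying new version of deployment TestAPI.
"""

# Captures the timestamp (without milliseconds) and the deployment name.
PATTERN = re.compile(
    r"^INFO (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r" - Deploying new version of deployment (\w+)\.$"
)


def redeploy_times(log_text):
    """Return {deployment_name: [datetime, ...]} for every redeploy event."""
    events = {}
    for line in log_text.splitlines():
        match = PATTERN.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.setdefault(match.group(2), []).append(stamp)
    return events


def gaps_seconds(stamps):
    """Return the time between consecutive events, in whole seconds."""
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


events = redeploy_times(LOG)
for name, stamps in events.items():
    print(name, len(stamps), gaps_seconds(stamps))
# prints: TestAPI 3 [100.0, 905.0]
```

Three deploys in about 17 minutes, at irregular intervals, means something is submitting the deployment again each time. The log alone cannot say whether that is a container restart or a manual rerun of the script.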

Could you also try changing the namespace to "serve"? All Serve actors run in that namespace.

I’d also recommend upgrading to the Ray Serve 2.x API. The 1.x API (serve.start() and .deploy()) is deprecated and may not be supported in future versions.