I figured it out after reading this thread. Basically, I need to add the following lines to serve.py:
import ray
ray.init()
serve.start(detached=True, http_options={"host": "0.0.0.0"})
Then, inside the Docker container, simply run python serve.py to start serving.
The full serve.py file is as follows for future reference:
from ray import serve
from starlette.requests import Request
from typing import Dict
import ray

# Initialize Ray before starting Serve.
ray.init()
# Bind the HTTP proxy to all interfaces so it is reachable from outside the container.
serve.start(detached=True, http_options={"host": "0.0.0.0"})


@serve.deployment
class Hello:
    def __init__(self):
        pass

    async def __call__(self, starlette_request: Request) -> Dict:
        # Parse the JSON request body and greet the caller by name.
        req = await starlette_request.json()
        name = req["name"]
        return {"response": f"Hello {name}!"}


serve.run(Hello.bind())
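
To check that the deployment is reachable from outside the container, a small client along these lines may help. This is only a sketch under a few assumptions that are not in the answer above: Serve is listening on its default HTTP port 8000, the deployment is mounted at the default route "/", and the container was started with that port published (for example with -p 8000:8000). The helper names build_hello_request and call_hello are made up for illustration.

```python
import json
import urllib.request


def build_hello_request(name, base_url="http://localhost:8000/"):
    """Build a POST request carrying the JSON body that Hello.__call__ expects."""
    # base_url is an assumption: Serve's default port (8000) and default route ("/").
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_hello(name, base_url="http://localhost:8000/"):
    """Send the request and decode the JSON response from the deployment."""
    request = build_hello_request(name, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires serve.py to be running and the port to be reachable):
# print(call_hello("Ray"))
```

If the call only works from inside the container, the usual suspects are the port not being published or the host option not being set to "0.0.0.0" as in serve.py above.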