How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello, how is it possible to deploy multiple models that are not in the same graph? That is, I want to deploy A and B and query them at url/a and url/b.
I tried using multiple serve.run() calls or serve run file:deployment, but each new run removes the previous deployment.
Also, the RayService config has only one import_path. How can we deploy multiple services in one RayService?
Thanks
You can create independent deployments and bind them to a single DAGDriver. Then you can serve.run()/serve run that single DAGDriver.
This section in the docs provides more details.
Yeah, although a single DAGDriver is a bit problematic if you have multiple endpoints (last example on the page linked above) and at least one of them has a FastAPI ingress: the paths don’t work out as you’d expect.
Incidentally: the dictionary form for constructing a DAGDriver isn’t mentioned in the general-purpose docs for 2.0 (as far as I can see), only in the migration guide.
When I was converting from Ray 1.11 to 2.0, I was still able to deploy multiple models like this:
Model1.options(init_args=model1_args, num_replicas=1, name="model1").deploy()
Model2.options(
    init_args=model2_args,
    num_replicas=num_reps,
    name="model2",
    ray_actor_options={"num_gpus": 1},
).deploy()
That being said, I was making direct calls on the model and not via the url served endpoints.
I did eventually convert over to use the DAG, just to see how it works.
@PaulRudin I’m not sure about “the paths don’t work out as you’d expect”. Do you have an example?
For me these two options work:
Option 1:
import requests
from fastapi import FastAPI
from ray import serve

app = FastAPI()


@serve.deployment()
@serve.ingress(app)
class Serve:
    @app.get("/ping1")
    def ping1(self):
        return "pong1"

    @app.get("/ping2")
    def ping2(self):
        return "pong2"


serve.run(Serve.bind())

resp = requests.get("http://localhost:8000/ping1")
assert resp.json() == "pong1"
resp = requests.get("http://localhost:8000/ping2")
assert resp.json() == "pong2"
Option 2:
import requests
from ray import serve
from ray.serve.drivers import DAGDriver


@serve.deployment()
class Ping1:
    def __call__(self):
        return "pong1"


@serve.deployment()
class Ping2:
    def __call__(self):
        return "pong2"


d = DAGDriver.bind({"/ping1": Ping1.bind(), "/ping2": Ping2.bind()})
handle = serve.run(d)

resp = requests.get("http://localhost:8000/ping1")
assert resp.json() == "pong1"
resp = requests.get("http://localhost:8000/ping2")
assert resp.json() == "pong2"
For reference, this doesn’t work:
import requests
from fastapi import FastAPI
from ray import serve


@serve.deployment(route_prefix="/ping")
class Ping1:
    def __call__(self):
        return "pong1"


@serve.deployment(route_prefix="/ping")
class Ping2:
    def __call__(self):
        return "pong2"


serve.run(Ping1.bind())
serve.run(Ping2.bind())

resp = requests.get("http://localhost:8000/ping1")
assert resp.json() == "pong1"
resp = requests.get("http://localhost:8000/ping2")
assert resp.json() == "pong2"
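To make the difference between the working and non-working variants concrete, here is a toy model in plain Python. It is not Ray code and not how Ray Serve is implemented; `ToyServe`, `run`, and `get` are hypothetical names invented for this sketch. It only assumes the behaviour reported at the top of this thread, namely that each new run removes what the previous run deployed, and contrasts that with passing one combined route table (the idea behind the DAGDriver dictionary form).

```python
from typing import Callable, Dict

Handler = Callable[[], str]


class ToyServe:
    """Hypothetical stand-in for a server whose run() replaces all routes."""

    def __init__(self) -> None:
        self.routes: Dict[str, Handler] = {}

    def run(self, routes: Dict[str, Handler]) -> None:
        # Assumption taken from this thread: a new run discards whatever
        # the previous run deployed instead of adding to it.
        self.routes = dict(routes)

    def get(self, path: str) -> str:
        # Exact-match lookup; unknown paths return a "404" marker string.
        handler = self.routes.get(path)
        if handler is None:
            return "404"
        return handler()


def ping1() -> str:
    return "pong1"


def ping2() -> str:
    return "pong2"


# Two separate runs: only the routes from the last run survive.
separate = ToyServe()
separate.run({"/ping1": ping1})
separate.run({"/ping2": ping2})

# One run with a combined route table: both paths stay reachable.
combined = ToyServe()
combined.run({"/ping1": ping1, "/ping2": ping2})

print(separate.get("/ping1"), separate.get("/ping2"))
print(combined.get("/ping1"), combined.get("/ping2"))
```

In this toy model the first `print` shows that `/ping1` is gone after the second run, while the combined table answers on both paths.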
Hi Ray Serve team, I would like to check whether Ray Serve currently supports the use case mentioned in the question: deploying multiple models via multiple serve run commands on different URL paths, for example serve run model1:deployment on /model1 and serve run model2:deployment on /model2. Thanks.
Hi @lizzzcai,
As mentioned in this discussion, right now you can only deploy multiple models with a single serve run
in 2.0 (1.x to 2.x API Migration Guide — Ray 2.0.0).
The old API is still supported until we provide a new API for your use case. (How to run multiple deployments in ray serve 2.0 - #5 by puntime_error)
Thanks @Sihan_Wang for your reply. I just want to know whether this is on the roadmap or whether I need to open an issue. Thanks.
Hi, we just published an RFC about the multi-application API.
Comments and suggestions are welcome! Thank you!