How severely does this issue affect your experience of using Ray?
- Medium: It contributes significant difficulty to completing my task, but I can work around it.
I’m new to Ray, but so far so good! Using the Ray Serve doc examples, I’ve built a “Driver” ingress deployment that dispatches (by URL) between two other deployments, one for “/embeddings” and one for “/inference”. It works well!
Driver.__init__() captures the deployments, and Driver.__call__() accepts a Request and dispatches between them based on request.url.path, e.g.:
    if request.url.path.startswith('/embeddings'):
        ref = await self.embedding.classify.remote(req)
    ...
    return await ref  # produces a Response
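
To make the dispatch pattern concrete, here is a minimal sketch that runs without Ray. It is not my actual Driver: the handler functions, the ROUTES table, and dispatch() are hypothetical stand-ins for the deployment handles and for Driver.__call__(), and the path string stands in for request.url.path.

```python
import asyncio

# Hypothetical stand-ins for the two downstream deployments.
async def embeddings_handler(path: str) -> str:
    return f"embeddings handled {path}"

async def inference_handler(path: str) -> str:
    return f"inference handled {path}"

# Route table: URL prefix -> async handler (assumed names, for illustration only).
ROUTES = {
    "/embeddings": embeddings_handler,
    "/inference": inference_handler,
}

async def dispatch(path: str) -> str:
    """Await the first handler whose prefix matches the request path."""
    for prefix, handler in ROUTES.items():
        if path.startswith(prefix):
            return await handler(path)
    return f"404: no route for {path}"

if __name__ == "__main__":
    print(asyncio.run(dispatch("/embeddings/abc")))
```

In the real Driver, each handler call would be a remote call on a captured deployment rather than a local coroutine.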
I’d now like to serve a Gradio page under the “/chat” URL. The standalone deployment works like a charm:
    import random

    import gradio as gr
    from ray import serve
    from ray.serve.gradio_integrations import GradioIngress

    def gradio_builder():
        def respond(message, user_bot_history):
            return random.choice(["Yes", "No"])

        iface = gr.ChatInterface(respond)
        # without this there is a constant polling loop of some kind
        iface.config["dev_mode"] = False
        return iface

    @serve.deployment(
        # route_prefix="/chat",
        ray_actor_options={"num_cpus": 0},
        autoscaling_config={"min_replicas": 1, "max_replicas": 1},
    )
    class MyGradioServer(GradioIngress):
        def __init__(self):
            super().__init__(gradio_builder)

    deploy = MyGradioServer.bind()
Now, finally, my question: how do I combine this deployment with my Driver ingress deployment so that the same server supports “/embeddings”, “/inference”, and “/chat”?