Hi team,
Is there a plan for Ray to integrate with the vLLM production-stack, given that both have a router? cc @kevin85421
I opened a GitHub issue: "[RayLLM] RayLLM / vLLM production stack integration" (Issue #53331, ray-project/ray).
@mtsai We plan to have a production stack with Ray Serve LLM somewhere. The production-stack sub-repo could be a good place, but we haven't decided on it yet. We are waiting for our routing, prefill-disaggregation, and DP + EP features to land first. Our Serve LLM solution will be more self-contained at the application layer (routing, P/D deployment, DP + EP, etc. will all be controlled within a single application), so it may not fit exactly with generic k8s-native customizations, which may involve multiple container deployments.