You cannot expose each Ray Serve application on a different port within a single Ray cluster: Ray Serve's HTTP options (host and port) are set once per cluster, so every application is served through the same HTTP port (8000 by default). The recommended approach is to deploy multiple Ray Serve applications (microservices) on a single Ray cluster using the multi-application API, assigning each application a unique route_prefix (e.g., /service1, /service2). KubeRay can manage a single Ray cluster and deploy multiple Serve applications into it, but all of them share the same HTTP port; they are distinguished by route, not by port. If you need each app on its own external port, put a reverse proxy (such as NGINX) in front of the cluster and have each external listener forward to the matching route_prefix on the shared Ray Serve HTTP endpoint. A Kubernetes Ingress can route by host or path to the same endpoint, but it typically exposes only ports 80/443 rather than arbitrary per-app ports.
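As a rough illustration of path-based routing, the sketch below shows a Kubernetes Ingress that sends two prefixes to the shared Serve endpoint. The names serve-apps and my-rayservice are hypothetical placeholders. It also assumes an NGINX ingress controller, a Serve Service named my-rayservice-serve-svc (KubeRay usually names it <rayservice-name>-serve-svc; check with kubectl get svc), and the default Serve port 8000.

```yaml
# Sketch only: names below are placeholders, not taken from the original question.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: serve-apps                      # hypothetical Ingress name
spec:
  ingressClassName: nginx               # assumption: NGINX ingress controller is installed
  rules:
    - http:
        paths:
          # Both prefixes go to the same Service and port; Ray Serve then
          # picks the application by matching the route_prefix.
          - path: /service1
            pathType: Prefix
            backend:
              service:
                name: my-rayservice-serve-svc   # assumed Serve Service name
                port:
                  number: 8000                  # default Ray Serve HTTP port
          - path: /service2
            pathType: Prefix
            backend:
              service:
                name: my-rayservice-serve-svc
                port:
                  number: 8000
```

Because the path is forwarded unchanged, no rewrite rule is needed as long as the external paths match the route_prefixes. Distinct external ports would instead require separate listeners in a reverse proxy placed in front of this endpoint.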
Best practices: define all applications in a single serveConfigV2 YAML under the applications field, giving each one a unique name and route_prefix. Use KubeRay's RayService CRD to manage both the cluster and the Serve applications. For port-based routing, point your Ingress or reverse proxy at the shared Ray Serve port and select the application through its route_prefix. See the Ray Serve on Kubernetes production guide, the multi-application services guide, and the community discussion listed under Sources for details.
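A minimal sketch of such a configuration is shown here, with two applications in one serveConfigV2 block of a RayService manifest. The resource name, application names, and import paths (service1.app:app, service2.app:app) are hypothetical. The rayClusterConfig section is left out, so this fragment is not deployable as written.

```yaml
# Fragment only: rayClusterConfig (head/worker pod specs) is omitted.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: my-rayservice                   # hypothetical name
spec:
  serveConfigV2: |
    applications:
      - name: service1                  # must be unique within the cluster
        route_prefix: /service1         # must be unique within the cluster
        import_path: service1.app:app   # hypothetical module:variable
      - name: service2
        route_prefix: /service2
        import_path: service2.app:app   # hypothetical module:variable
```

With a config along these lines, both applications would be reachable on the same Serve port, for example at /service1 and /service2.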
Would you like a step-by-step example YAML or Ingress configuration?
Sources:
- Ray Serve on Kubernetes production guide
- Multi-application services
- Set different HTTP port for different deployments (Ray Discuss)