How to expose Ray Serve for production/testing?

Hi. I have Ray Serve running on a Ray Cluster on Kubernetes, which I deployed using the ray/doc/kubernetes/ray-cluster.yaml configuration file. Is there a simple or canonical way to expose this server so that I can make network requests to it from the host machine or other machines on the LAN?

I am new to Kubernetes. It seems like I might need to add a new Kubernetes service and/or a NodePort or Load Balancer. I thought I would ask here in case there is already some simpler or built-in way of doing this that I am missing.

Thanks.

Hi @Casey,

For a demo or quick development, you can use:

  • NodePort, if you can reach the ports on the k8s nodes directly
  • ClusterIP + kubectl proxy, if you cannot reach the node ports directly

For production, I would recommend the LoadBalancer service type. It provisions an external load balancer if you are running on a cloud provider; otherwise, what you get depends on the specific k8s offering.

Regardless of the service type, the k8s service should target the Serve HTTP port on all Ray nodes.
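
As a rough sketch of the NodePort option, a Service manifest could look something like the one below. The namespace, the `type: ray` selector label, port 8000, and the `ray-serve` name are all assumptions on my part, so please check them against the labels and ports in your own ray-cluster.yaml. It also assumes Serve's HTTP server is listening on an address that other pods can reach, not only on localhost.

```yaml
# Sketch only: the namespace, labels, and port numbers are assumptions.
# Verify them against your own ray-cluster.yaml before applying.
apiVersion: v1
kind: Service
metadata:
  name: ray-serve          # hypothetical name
  namespace: ray           # assumed namespace of the Ray pods
spec:
  type: NodePort           # switch to LoadBalancer for production
  selector:
    type: ray              # assumed label shared by head and worker pods
  ports:
    - name: serve
      protocol: TCP
      port: 8000           # port the Service exposes inside the cluster
      targetPort: 8000     # assumed Serve HTTP port on the Ray pods
      nodePort: 30080      # hypothetical; must be in the NodePort range (30000-32767 by default)
```

After `kubectl apply -f` on a file with this content, requests to `http://<node-ip>:30080/` from the host or the LAN should be forwarded to Serve, provided the node IP is reachable from there.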
