Deploying an ML model using Ray Serve on K8s

I followed this documentation to deploy an ML model on a Ray cluster in K8s (Deploying Ray Serve — Ray v1.2.0).

Sorry if my questions are silly, as I am new to Ray!

My questions are:

  1. The Ray cluster is already initialized in K8s (done by following Deploying on Kubernetes — Ray v1.2.0). If I run the ray submit command to run my Python model pipeline file, do I have to use the YAML (example-full.yaml) mentioned in the document? Would ray submit identify the existing Ray cluster and update it, or create a new one?

  2. Is there any other way to serve my models without using the ray submit command, for example with kubectl? If not, what parameter should I add to the ray submit command to tell Ray to use a specific kubeconfig file?

  3. How do I containerize my model file and deploy it on the Ray cluster? Could you please share a sample Dockerfile?

How severe does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi @SivaSankariRamamoort, welcome to the forums! Glad to hear that you’re exploring Ray.

Out of curiosity, is there a reason you’re using Ray 1.2.0? That version is a bit outdated (the current release is 1.13, and Ray 2.0 is launching later this year). I’m not super familiar with Ray 1.2, but I’ll do my best to answer your questions:

  1. That YAML file is for the Ray Cluster Launcher tool, so I don’t think you need to pass it to ray submit. However, you should make sure to expose port 8000 on your Kubernetes cluster so it can forward requests to your deployments (which listen on port 8000 by default).

  2. You should be able to use kubectl to set up your models. You can kubectl exec into your cluster, start Serve with the CLI command serve start, and then run a Python script that creates your deployments on the Kubernetes cluster.

  3. Ray offers some default Docker images through DockerHub. You can also search this forum for posts from other users that rely on Docker; for example, this post might be helpful.
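To make the port-exposure note in point 1 concrete, here’s a minimal sketch of a Kubernetes Service that forwards port 8000 to the Ray head pod. The Service name and selector labels are assumptions; match them to the labels your Ray head pod actually carries:

```yaml
# Hypothetical sketch — adjust the name and selector to your cluster.
apiVersion: v1
kind: Service
metadata:
  name: ray-serve
spec:
  selector:
    component: ray-head   # assumed label on the Ray head pod
  ports:
    - name: serve
      port: 8000          # Serve's default HTTP port
      targetPort: 8000
```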
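For point 3, a Dockerfile built on one of those default images could look roughly like this. The image tag and file names (requirements.txt, deploy_model.py) are assumptions about your project layout, not something prescribed by the Ray docs:

```dockerfile
# Hypothetical sketch: package a Serve deployment script on a Ray base image.
FROM rayproject/ray:1.13.0

WORKDIR /app

# Install your model's Python dependencies.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the script that creates your Serve deployments.
COPY deploy_model.py .

# Deploy the model when the container starts.
CMD ["python", "deploy_model.py"]
```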

I recommend checking out some more recent Ray releases if you’re able to. For example, Ray 2.0 will offer a more ops-friendly workflow for running Ray Serve on Kubernetes using KubeRay. You’ll be able to configure your deployments directly through the operator, so you won’t have to manually kubectl exec into your cluster to manage your deployments.