Deploying an ML model using Ray Serve on K8s

Hi @SivaSankariRamamoort, welcome to the forums! Glad to hear that you’re exploring Ray.

Out of curiosity, is there a reason you're using Ray 1.2.0? That version is a bit outdated (the current release is 1.13, and Ray 2.0 is launching later this year). I'm not super familiar with Ray 1.2, but I'll do my best to answer your questions:

  1. That YAML file is for the Ray Cluster Launcher tool, so I don't think you need to include it in your ray submit file. However, you should make sure to expose port 8000 on your Kubernetes cluster, so it can forward requests to your deployments (which listen on port 8000 by default). The second sketch after this list shows a quick way to test that the port is reachable.

  2. You should be able to use kubectl to set up your models. You can kubectl exec into your cluster, start Serve with the CLI command serve start, and then run a Python script that creates your deployments on the Kubernetes cluster (see the first sketch after this list).

  3. Ray offers some default Docker images through DockerHub. You can also search this forum for posts from other users that rely on Docker. For example, this post might be helpful.
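To illustrate point 2, here's a rough sketch of the kind of Python script you could run after serve start. Note that it uses the deployment API from more recent Ray versions (1.4+), so it won't work verbatim on Ray 1.2; the MyModel class and the /predict route are placeholders for your own model:

```python
# deploy_model.py -- run inside the cluster after `serve start`.
import ray
from ray import serve

ray.init(address="auto")    # connect to the running Ray cluster
serve.start(detached=True)  # attaches to the Serve instance started via the CLI

# Hypothetical deployment; swap in your own model-loading and
# inference logic. Serve routes HTTP traffic to it at /predict.
@serve.deployment(route_prefix="/predict")
class MyModel:
    def __init__(self):
        # Load your model weights here; this stand-in just sums the input.
        self.model = lambda x: sum(x)

    async def __call__(self, request):
        data = (await request.json())["data"]
        return {"result": self.model(data)}

MyModel.deploy()
```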
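And for point 1, once port 8000 is exposed (for example, by forwarding it from the head node pod with kubectl port-forward), you could sanity-check the endpoint with something like the snippet below. The localhost address and /predict route are assumptions that match the sketch above:

```python
import requests

# Assumes port 8000 is forwarded from the Ray head node and the
# hypothetical MyModel deployment above is serving at /predict.
resp = requests.post("http://localhost:8000/predict", json={"data": [1, 2, 3]})
print(resp.json())  # -> {"result": 6}
```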

I recommend checking out some more recent Ray releases if you're able to. For example, Ray 2.0 will offer a more ops-friendly workflow for running Ray Serve on Kubernetes using KubeRay. You'll be able to configure your deployments directly through the operator, so you won't have to manually kubectl exec into your cluster to manage them.