Deploying an ML model using Ray Serve on K8s

Hi @SivaSankariRamamoort, welcome to the forums! Glad to hear that you’re exploring Ray.

Out of curiosity, is there a reason you're using Ray 1.2.0? That version is a bit outdated (the current release is 1.13, and Ray 2.0 is launching later this year). I'm not super familiar with Ray 1.2, but I'll do my best to answer your questions:

  1. That YAML file is for the Ray Cluster Launcher tool, so I don't think you need to include it in your ray submit file. However, you should make sure to expose port 8000 on your Kubernetes cluster, so it can forward requests to your deployments (which listen on port 8000 by default). The second sketch after this list shows a quick way to test that the port is reachable.

  2. You should be able to use kubectl to set up your models. You can kubectl exec into your cluster, start Serve with the CLI command serve start, and then run a Python script that creates your deployments on the Kubernetes cluster (see the first sketch after this list).

  3. Ray offers some default Docker images through DockerHub. You can also search this forum for posts from other users that rely on Docker. For example, this post might be helpful.
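To illustrate point 2, here's a rough sketch of the kind of Python script you could run after serve start. Note that it uses the deployment API from more recent Ray versions (1.4+), so it won't work verbatim on Ray 1.2; the MyModel class and the /predict route are placeholders for your own model:

```python
# deploy_model.py -- run inside the cluster after `serve start`.
import ray
from ray import serve

ray.init(address="auto")    # connect to the running Ray cluster
serve.start(detached=True)  # attaches to the Serve instance started via the CLI

# Hypothetical deployment; swap in your own model-loading and
# inference logic. Serve routes HTTP traffic to it at /predict.
@serve.deployment(route_prefix="/predict")
class MyModel:
    def __init__(self):
        # Load your model weights here; this stand-in just sums the input.
        self.model = lambda x: sum(x)

    async def __call__(self, request):
        data = (await request.json())["data"]
        return {"result": self.model(data)}

MyModel.deploy()
```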
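And for point 1, once port 8000 is exposed (for example, by forwarding it from the head node pod with kubectl port-forward), you could sanity-check the endpoint with something like the snippet below. The localhost address and /predict route are assumptions that match the sketch above:

```python
import requests

# Assumes port 8000 is forwarded from the Ray head node and the
# hypothetical MyModel deployment above is serving at /predict.
resp = requests.post("http://localhost:8000/predict", json={"data": [1, 2, 3]})
print(resp.json())  # -> {"result": 6}
```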

I recommend checking out some more recent Ray releases if you're able to. For example, Ray 2.0 will offer a more ops-friendly workflow for running Ray Serve on Kubernetes using KubeRay. You'll be able to configure your deployments directly through the operator, so you won't have to manually kubectl exec into your cluster to manage them.