Ray Serve on Kubernetes

I have coded a pipeline of multiple Ray Serve deployments which run a deep learning model on the user input and respond with the model's output. First, there are multiple standalone Ray Serve deployment classes which each run one model on the input (these endpoints can also be called by the user directly). Then there is another deployment class which gets a handle to all the standalone modules, runs them, concatenates their outputs, and sends the result back to the user. I now want to move these @serve.deployment classes to K8s, where I can use autoscaling. I am able to start a minikube cluster and launch a Ray cluster using the provided Helm chart, but I don't understand how to move my Ray Serve deployment classes to K8s and serve them to users. Could someone please direct me to a resource or tutorial for doing this?
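
Roughly, the structure is something like this (a simplified sketch with placeholder model names, assuming a recent Ray Serve version with the bind/handle composition API):

```python
from ray import serve
from ray.serve.handle import DeploymentHandle


@serve.deployment
class ModelA:
    def __call__(self, text: str) -> str:
        # Placeholder for running the first standalone model on the input.
        return f"A({text})"


@serve.deployment
class ModelB:
    def __call__(self, text: str) -> str:
        # Placeholder for running the second standalone model on the input.
        return f"B({text})"


@serve.deployment
class Composer:
    def __init__(self, model_a: DeploymentHandle, model_b: DeploymentHandle):
        # Handles to the standalone deployments, injected via .bind() below.
        self.model_a = model_a
        self.model_b = model_b

    async def __call__(self, request) -> str:
        text = (await request.json())["text"]
        # Run both standalone models and concatenate their outputs.
        out_a = await self.model_a.remote(text)
        out_b = await self.model_b.remote(text)
        return out_a + " " + out_b


# The composed application; deployed with `serve.run(app)` on the Ray cluster.
app = Composer.bind(ModelA.bind(), ModelB.bind())
```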

Hi @OAfzal, you can follow this example in the documentation to deploy on Kubernetes:
https://docs.ray.io/en/latest/serve/deployment.html#deploying-on-kubernetes

So I was able to get the above to work. I wanted to know whether it is necessary to initialize a Ray cluster on an already existing K8s cluster to deploy a Ray Serve application, or whether I could just containerize multiple microservices with Ray Serve for serving. Would that still allow model composition or communication between the containers using handles?

I would also like to know: if a script uses a saved_model file to load a model, how would that model file be passed to the cluster? I am referring to running a script on the cluster using ray submit. If the script sent with ray submit has a file dependency, how should that be handled?

@OAfzal to your first question: if you want to do model composition using Ray Serve, all of the models (deployments) should be running on the same Ray cluster.

To your second question: for production usage I'd recommend baking that saved_model file into your container image and loading it from disk in the deployment constructor. You could also use Ray's working_dir support, but this is more dynamic than building it into the container, so there is more potential for failure.
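
For example, something like this (a rough sketch with a hypothetical path, assuming a TensorFlow SavedModel baked into the image at /app/models/saved_model):

```python
import tensorflow as tf  # swap in whatever framework your saved_model uses

from ray import serve


@serve.deployment
class ModelDeployment:
    def __init__(self, model_path: str = "/app/models/saved_model"):
        # The path exists inside the container because the Dockerfile copies the
        # saved_model directory into the image; loading happens once per replica.
        self.model = tf.saved_model.load(model_path)

    def __call__(self, inputs):
        # Placeholder inference call; adapt to your model's signature.
        return self.model(inputs)
```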

So for every model, do I create a container and deploy it using kubectl? Secondly, is it necessary to initialize the Ray cluster using the Helm chart?

You don’t need to create an individual container for each model; you can use the Helm chart to create a multi-pod Ray cluster and then deploy Serve on that cluster (the models will run across the different pods).

Do I submit the model deployment script using ray submit (like in the script above)? But you mentioned creating a container image with the model in it. How would I do something like this? I am sorry if my questions are a bit basic; I just started working with K8s, so I'm trying to learn.

Ah, what I was suggesting is to include the model file in your Dockerfile (or however you are building your container image). If you aren't building the image yourself or want to get off the ground quickly, using the working_dir option of runtime_env would be a good option too:
https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#using-local-files
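
As a rough sketch of that option (hypothetical paths, assuming the local directory ./my_app contains your script and a pickled saved_model.pkl file, and a recent Ray Serve version):

```python
import pickle

import ray
from ray import serve

# Everything under ./my_app is uploaded to the cluster and becomes the working
# directory of the Ray workers, so relative paths resolve there.
ray.init(address="auto", runtime_env={"working_dir": "./my_app"})


@serve.deployment
class ModelDeployment:
    def __init__(self):
        # "saved_model.pkl" lives in the uploaded working_dir on every worker.
        with open("saved_model.pkl", "rb") as f:
            self.model = pickle.load(f)

    def __call__(self, inputs):
        return self.model.predict(inputs)


serve.run(ModelDeployment.bind())
```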

Hi,
So, my situation is similar to the one discussed here. I have a pipeline, a model, and a set of dependent files needed for the pipeline to run. I am trying to deploy this to a Ray cluster which is installed on Kubernetes.

My understanding is that the ‘ray submit’ command places the file I pass as an argument onto the head node pod and runs it there. But, as I described above, I have a set of files along with the model file. So I see it is suggested to include the files in the Dockerfile and deploy it as a separate container.

Here are my questions

  • Can I just use python3 as the base image, or should I use rayproject as the base image (https://hub.docker.com/r/rayproject/ray)? Is there any doc I can refer to?
  • Could you share a sample YAML file to deploy this as a separate pod?
  • Would Ray still autoscale the pods when running as a separate pod?

Hi @SivaSankariRamamoort, we’ve recently updated the Ray Kubernetes documentation: Ray on Kubernetes — Ray 3.0.0.dev0. Can you see if it answers your questions? The getting started guide references some example YAML files. Autoscaling will still work as usual.

Hi Archit Kulkarni,

Thanks for taking the time to reply. But the link you have shared is about how to deploy a Ray cluster on Kubernetes. I’m looking for a doc on how to deploy Ray Serve on Kubernetes.

I think the answers to your questions should be the same whether you're using Ray Serve or any other Ray application, if I’m understanding them correctly. Once your Ray cluster is deployed, you can deploy Serve on it in the usual way.
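
For instance, a minimal sketch of "the usual way" (hypothetical address, assuming the head node's Ray Client port 10001 is reachable, e.g. via kubectl port-forward, and Ray >= 2.0):

```python
import ray
from ray import serve


@serve.deployment
class Hello:
    def __call__(self, request) -> str:
        return "hello from the Ray cluster on Kubernetes"


# Connect to the existing cluster instead of starting a local one.
ray.init(address="ray://127.0.0.1:10001")

# Deploy the Serve application on that cluster.
serve.run(Hello.bind())
```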

You might also be interested in RayService - KubeRay Docs, which you can test out with pip install "ray[serve, default]==2.0.0rc1".

I am facing something along these lines. I am creating a static Ray cluster and trying to deploy multiple independent models with independent routes on a single Ray cluster. I want to autoscale not only the workers but also the model replicas according to requests, and I am not using KubeRay.

There are not many resources on how to serve multiple models on a single static cluster.

Check out these docs on deploying multiple applications with Ray Serve: Deploy Multiple Applications — Ray 2.10.0. You can split each independent model into a separate application and run them on a single cluster.
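
A rough sketch of what that looks like (hypothetical names and routes, assuming Ray Serve 2.x multi-application support); the autoscaling_config handles scaling the model replicas, while the cluster autoscaler handles the worker nodes:

```python
from ray import serve


@serve.deployment(autoscaling_config={"min_replicas": 1, "max_replicas": 5})
class ModelOne:
    def __call__(self, request) -> str:
        return "output of model one"


@serve.deployment(autoscaling_config={"min_replicas": 1, "max_replicas": 5})
class ModelTwo:
    def __call__(self, request) -> str:
        return "output of model two"


# Each model is its own application with its own route on the same cluster.
serve.run(ModelOne.bind(), name="model_one", route_prefix="/model_one")
serve.run(ModelTwo.bind(), name="model_two", route_prefix="/model_two")
```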