Making model accessible across the nodes on Ray Serve - how

ivan_prado · October 31, 2023, 11:17am

How severely does this issue affect your experience of using Ray?

Medium: It makes completing my task significantly more difficult, but I can work around it.

Hi,

We have developed a gRPC service on top of Ray Serve, but we are struggling to distribute the model so that it is accessible when the service’s init is invoked. We don’t want to keep the model in external file storage (e.g. S3); instead, we want to deploy it directly to the Ray cluster. Ideally, Ray would handle the distribution automatically.

We have explored Ray objects, but they don’t seem to be the right fit: they work at the level of Python objects, occupy memory, and are slow.

An alternative is to copy the model to the nodes manually, but that adds operational overhead and won’t work well with autoscaling.

What’s the right way of doing it with Ray? Thank you very much.

shrekris · November 9, 2023, 5:50pm

Hi @ivan_prado, welcome to the forums!

Could you create a custom Docker image that contains the model (relevant docs)? That way, when a node starts running, the model is immediately accessible.
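
As a rough illustration of that idea, here is a minimal sketch of a service class that reads the model from a path inside the image when it is constructed. The environment variable `MODEL_PATH`, the default path `/opt/models/model.bin`, and the class name `ModelServer` are all assumptions made up for this example, and reading raw bytes stands in for whatever loader your framework provides.

```python
import os
from pathlib import Path

# Hypothetical names: this environment variable and default path are
# assumptions for the sketch, not Ray conventions.
MODEL_PATH_ENV = "MODEL_PATH"
DEFAULT_MODEL_PATH = "/opt/models/model.bin"


class ModelServer:
    """Loads a model file that was copied into the Docker image at build time."""

    def __init__(self) -> None:
        model_path = Path(os.environ.get(MODEL_PATH_ENV, DEFAULT_MODEL_PATH))
        if not model_path.is_file():
            # Fail fast so a replica on a node without the file is obvious.
            raise FileNotFoundError(f"model file not found in image: {model_path}")
        self.model_path = model_path
        # Placeholder load step: swap in your framework's own loader here.
        self.model_bytes = model_path.read_bytes()

    def model_size(self) -> int:
        """Return the size of the loaded model in bytes."""
        return len(self.model_bytes)


# With Ray installed, the class could be wrapped and deployed roughly like this:
#   from ray import serve
#   app = serve.deployment(ModelServer).bind()
#   serve.run(app)
```

Because every node starts from the same image, each replica finds the file locally in its init, including replicas on nodes added later by the autoscaler.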

ivan_prado · November 10, 2023, 10:29am

Thank you @shrekris for answering!

One problem I see with the Docker approach is that we would have to restart the whole cluster whenever we want to deploy a new model, which would stop this service and other related services in the same cluster.

Or maybe I’m not understanding it well. How would you deploy the new Docker images without interrupting the service?

shrekris · November 10, 2023, 7:20pm

You would have to start a new cluster. One way to do this without interrupting the service is to:

1. Start the new cluster, and start the new Serve application on it.
2. Shift traffic from your old cluster to your new cluster.
3. Shut down the old cluster.

This is the pattern that KubeRay uses to perform zero-downtime updates.
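
The ordering of those steps can be sketched in plain Python. This is only an illustration of the sequence, not how KubeRay implements it: the function `blue_green_rollout` and its four callables are hypothetical hooks that you would back with your own cluster tooling and load balancer. The health check between starting the new cluster and shifting traffic is an added assumption, there so that the old cluster keeps serving if the new one never comes up.

```python
import time
from typing import Callable


def blue_green_rollout(
    start_new: Callable[[], None],
    is_healthy: Callable[[], bool],
    shift_traffic: Callable[[], None],
    stop_old: Callable[[], None],
    max_checks: int = 30,
    poll_interval_s: float = 2.0,
) -> None:
    """Run the rollout steps in order; all callables are hypothetical hooks."""
    # Step 1: bring up the new cluster and its Serve application.
    start_new()

    # Assumed extra step: wait until the new application reports healthy.
    for _ in range(max_checks):
        if is_healthy():
            break
        time.sleep(poll_interval_s)
    else:
        # Traffic has not moved, so the old cluster keeps serving.
        raise RuntimeError("new cluster never became healthy; old cluster left running")

    # Step 2: point clients at the new cluster.
    shift_traffic()

    # Step 3: only now is it safe to remove the old cluster.
    stop_old()
```

Since the old cluster is shut down last, a failed rollout leaves the existing service untouched.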
