[Serve] Is it possible to serve a model without running a cluster

dirtyValera · September 3, 2023, 6:53am

My use case is the following:

I have a Ray cluster that users use to train, validate, and tune models, and to run custom user-defined logic against those models or combinations of them (defined with a deployment graph). The models are served with Ray Serve, and all of this happens on a single Ray cluster. At the end, each user picks the model they want to keep using.

Now I want to let users run their UDF plus the selected model or model graph on any other platform, not just Ray (e.g. in a plain Python process). Is this possible?

AFAIK serve.run() spins up Ray actors under the hood. Is it possible to run it as a separate process, thread, or async loop instead, so that users' apps don't depend on a running Ray cluster?
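
To make the goal concrete, here is a minimal sketch of the desired end state: the selected models and the UDF run as plain Python objects on an ordinary asyncio loop, with no Ray import at all. Every name in it (`Doubler`, `AddOne`, `LocalGraph`, `run_local`) is hypothetical and only illustrates the shape of the setup; it is not Ray Serve's API, and whether Serve itself can run in this mode is exactly the open question.

```python
import asyncio
from typing import Callable, Iterable


class Doubler:
    """Hypothetical stand-in for a trained model."""

    def predict(self, x: float) -> float:
        return 2.0 * x


class AddOne:
    """Second hypothetical model, chained after the first."""

    def predict(self, x: float) -> float:
        return x + 1.0


class LocalGraph:
    """Runs the models in order, then applies the user-defined function.

    Everything happens in the calling process; no cluster or actors.
    """

    def __init__(self, models: Iterable, udf: Callable[[float], float]):
        self.models = list(models)
        self.udf = udf

    async def __call__(self, x: float) -> float:
        for model in self.models:
            x = model.predict(x)
        return self.udf(x)


def run_local(graph: LocalGraph, x: float) -> float:
    # Drive the graph on a plain asyncio event loop.
    return asyncio.run(graph(x))


# Example UDF: scale the final prediction by 10.
result = run_local(LocalGraph([Doubler(), AddOne()], udf=lambda y: y * 10), 3.0)
```

Keeping the model classes free of Ray imports like this would, under these assumptions, let the same classes be wrapped for Serve on the cluster and instantiated directly elsewhere.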

dirtyValera · September 5, 2023, 5:22am

Bumping this, any help?

Alexandre_Quadra · August 14, 2024, 1:26pm

Bumping this again…
Any help? I am also interested in this topic.
