Deployment Graph vs ServeHandle

rabraham · May 27, 2022, 9:09pm

Hi,
What is the benefit of using the new Deployment Graph API vs composing different ServeHandles?

Thanks,
Rajiv

Jiao_Dong · May 27, 2022, 9:37pm

Hi @rabraham, great question. In short, the two are equivalent in power and expressiveness. The difference is more of a stance we take between providing a static graph and a dynamic graph: a static definition makes it much easier for us to build new operational APIs for multi-model inference, and it leaves room for further performance optimization.

You can see more of this in our blog post:

The graph structure is hidden in the logic of the entire codebase. Without a static definition of the topology, our users need to read the entire codebase and track down each handle in order to see how the graph is composed. In order to test the graph, users need to manually write a deployment script that deploys each deployment in the correct topological order.
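As a rough illustration of the "correct topological order" point, here is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+). The deployment names and the dependency map are hypothetical, and no Ray Serve API is called; it only shows the kind of ordering a hand-written deployment script has to get right.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each key lists the deployments it depends on.
# These names are made up for illustration; they are not from the blog post.
dependencies = {
    "preprocessor": set(),
    "model_a": {"preprocessor"},
    "model_b": {"preprocessor"},
    "ensemble": {"model_a", "model_b"},
}


def deployment_order(deps):
    """Return an order in which every deployment comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(dependencies)
for name in order:
    # A real script would deploy `name` here; this sketch only prints it.
    print("deploy", name)
```

With handles scattered through the codebase, the dependency map itself has to be reconstructed by reading the code, which is the part a static graph definition makes explicit.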

It’s hard to operationalize the deployment graph for production. Given the observation above, it’s also difficult to take operational actions on the deployment graph without diving into the codebase, such as adjusting parameters like num_replicas, updating the link to the latest model weights, etc.

It can be hard to optimize the graph. If you look carefully at the code snippet above, it calls await in a loop, which is an anti-pattern: we wait for each result to return one by one instead of running the calls in parallel. With a static graph definition, we can avoid these performance bugs and provide advanced support for optimizations such as fusing or co-locating nodes on the same host.
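The blog's snippet is not reproduced in this thread, so the following is only a minimal sketch of the await-in-a-loop pattern it describes, written with plain `asyncio`. `fake_model` is a hypothetical stand-in for a remote model call, and the 0.1 s delay is an assumed latency; none of this is Ray Serve code.

```python
import asyncio
import time


async def fake_model(name: str, delay: float = 0.1) -> str:
    # Hypothetical stand-in for a remote model call with some latency.
    await asyncio.sleep(delay)
    return f"{name}-result"


async def sequential(names):
    # Anti-pattern: each await blocks until the previous call has finished.
    results = []
    for name in names:
        results.append(await fake_model(name))
    return results


async def parallel(names):
    # Start all calls first, then wait for them together.
    return await asyncio.gather(*(fake_model(name) for name in names))


def timed(coro_fn, names):
    """Run the coroutine function and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(names))
    return results, time.perf_counter() - start
```

Calling `timed(sequential, names)` and `timed(parallel, names)` on the same list should return the same results, with the sequential version taking roughly the sum of the delays and the parallel version roughly the longest single delay.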

rabraham · May 31, 2022, 9:38am

Hi @Jiao_Dong
Thank you very much for that answer. It really helps. I’ll have to slowly unpack that article but it looks great so far. I’ll let you know if I have more questions.

rabraham · May 31, 2022, 10:17am

Great Article. Thanks!

Minor question:

What does “IR” stand for in “Expressing parallel calls is very trivial: just use the same variable. The same variable name (backed by an IR node behind the scene) will be resolved to the same value in separate nodes.”?

And a small FYI, there is a typo in:
general_model = Genenal_Classification.bind(weights="s3://bucket/file")

Jiao_Dong · May 31, 2022, 5:39pm

Great catch, and I just notified our content team to update it. Thanks for reading our blog post so carefully.

rabraham · June 1, 2022, 12:50am

writing a thesis is of some use after all
