Hi Ray Serve community,
In the last few weeks, the Ray Serve team has put up a public design doc for Serve Pipeline, which aims to provide the best API for authoring an inference graph of models as a pipeline, along with artifacts and APIs for operationalizing Serve deployments. It intends to cover multi-model inference as well as large-model partitioning, such as distributed inference.
It’s an evolution of our existing Alpha API, in which many pieces will change.
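For those curious what authoring a graph could look like, here is a minimal, hypothetical sketch of composing two models into a pipeline using a `bind()`-style composition pattern. The class names, handle-call pattern, and graph semantics below are illustrative assumptions based on the existing Alpha API, not the final design in the doc:

```python
from ray import serve


@serve.deployment
class Preprocessor:
    def __call__(self, raw: str) -> str:
        # Toy preprocessing step standing in for a real featurizer.
        return raw.strip().lower()


@serve.deployment
class Classifier:
    def __init__(self, preprocessor):
        # The upstream graph node is injected as a handle at deploy time.
        self.preprocessor = preprocessor

    async def __call__(self, raw: str) -> str:
        cleaned = await self.preprocessor.remote(raw)
        # Toy model standing in for real inference.
        return "positive" if "good" in cleaned else "negative"


# Compose deployments into a two-stage inference graph: the Classifier
# node takes the Preprocessor node as a constructor argument.
preprocessor = Preprocessor.bind()
classifier = Classifier.bind(preprocessor)

# Deploy the whole graph from its root node.
serve.run(classifier)
```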
We’re actively looking for community comments, feedback, and collaboration on this effort. Please don’t hesitate to reach out by commenting on the doc or on this thread, via our Slack channel, or by emailing us at serving@anyscale.com. Thanks!
Best,
Ray Serve Team