Help designing a fire-and-forget server for large batch inference

Well, after posting this here I did some digging and found this thread: Job Queue in Ray? - #24 by ericl

That thread led me to these RFCs, and to discovering the existence of Ray Workflows, which I believe could cover my use case :slight_smile:

In any case, I'd appreciate guidance on how to use Workflows to design something like this, if anybody has tips or best practices. I'm aware it's still in Alpha.
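To make the question concrete, here's the kind of fire-and-forget flow I'm imagining, sketched against the alpha `@workflow.step` API as I understand it from the docs. The step names, the S3 path, and the `workflow_id` are all placeholders of mine, and I may well be misusing the API:

```python
import ray
from ray import workflow

# Hypothetical steps; the real ones would load data and score it with a model.
@workflow.step
def load_batch(path: str) -> list:
    return ["example", "inputs"]  # stand-in for reading the actual batch

@workflow.step
def run_inference(batch: list) -> list:
    return [len(x) for x in batch]  # stand-in for a model forward pass

ray.init()
workflow.init()

# Build the DAG, then fire and forget: run_async() returns immediately,
# and the workflow keeps running durably under the given id.
pipeline = run_inference.step(load_batch.step("s3://bucket/batch-001"))
pipeline.run_async(workflow_id="batch-001")
```

If I'm reading the docs right, a client could later poll with `workflow.get_status("batch-001")` and fetch the result with `ray.get(workflow.get_output("batch-001"))`, which is exactly the job-queue semantics I'm after.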

I'm also excited to see the development of async requests in Serve!
One suggestion: make Ray Workflows more visible. It took me quite a while to find out it existed, and I think use cases like mine are becoming more and more common (as the second RFC suggests) :slight_smile: