Help designing fire and forget server for large batch inference

jonaz · November 30, 2023, 10:00am

Thanks @eoakes for your detailed answers!

Regarding this, I think I actually tried that, but with no success. For some reason, I couldn't make a deployment_handle.remote() call from inside a Workflow task/step; I'm probably using the API wrong somewhere. I made another thread about that here: Workflow calling Deployment.remote()? Maybe you could check it out? I'd really appreciate it.
