Concurrency groups in Serve Deployments

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hello,

I am using Ray Serve deployments with multiple methods. One method does the heavy compute, and the other methods are lightweight and only report the status of the deployment. I would like the compute method to run in a different concurrency group from the lighter methods. I see that “concurrency_groups” exists for ray.remote(), but I don't see an equivalent for serve.deployment(). Is there a way to use “concurrency_groups” with Serve deployments?

Correct, concurrency_groups is not a Serve deployment config option. Ideally, you would split the heavy method and the light methods into two separate deployments, and then use model composition to pass requests and responses between them. See this doc for more details: Deploy Compositions of Models (Ray 2.35.0)
