[Medium]
For our production Ray Serve application, we'd like to perform gradual rollouts to make sure new code is safe. Currently, KubeRay performs a complete cutover to the new deployment once it has confirmed that the deployment is healthy.
An example strategy would be to route 10% of traffic to the new version for 1 hour, then progress to 30%, and so on.
Our current solution is to use an external AWS load balancer with weighted target groups to control the traffic percentages across a blue/green cluster setup.
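To make the setup above concrete, here is a rough sketch of how the stepped weights could be expressed in Python. The schedule values, the function names, and the ARNs are made up for illustration; only the shape of the weighted `forward` action follows what the ELBv2 `modify_listener` call accepts. It only builds the payloads and does not talk to AWS.

```python
from typing import Iterator, List, Tuple

# Hypothetical schedule: (percent of traffic to the new cluster, hold time in seconds).
ROLLOUT_STEPS: List[Tuple[int, int]] = [(10, 3600), (30, 3600), (100, 0)]


def forward_action(blue_arn: str, green_arn: str, green_percent: int) -> dict:
    """Build a weighted 'forward' listener action splitting traffic blue/green."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                # Old (blue) cluster keeps whatever the new one does not take.
                {"TargetGroupArn": blue_arn, "Weight": 100 - green_percent},
                {"TargetGroupArn": green_arn, "Weight": green_percent},
            ]
        },
    }


def plan(
    blue_arn: str,
    green_arn: str,
    steps: List[Tuple[int, int]] = ROLLOUT_STEPS,
) -> Iterator[Tuple[dict, int]]:
    """Yield (listener action, hold seconds) for each step of the rollout."""
    for percent, hold_seconds in steps:
        yield forward_action(blue_arn, green_arn, percent), hold_seconds
```

Each action could then be applied with something like `boto3.client("elbv2").modify_listener(ListenerArn=..., DefaultActions=[action])`, waiting for the hold time and checking health before moving to the next step. Doing this by hand outside of KubeRay is what we'd like to avoid.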
We've also explored Argo Rollouts; however, RayService is a CRD and Rollouts is limited to handling Kubernetes Deployments.
Additions to the KubeRay documentation covering more complex deployment strategies would be appreciated as well.
Tested on Ray 2.4.0 and KubeRay nightly.