That is, I'd like to smoothly update deployed models to the latest version without stopping the online service, with support for rollback operations — just like TensorFlow Serving.
Hi Japson,
Currently Ray Serve doesn't have built-in functionality for model loading/rollback/version control.
(To unblock you:) For a new model version, could you directly trigger another deployment (new model), shift traffic to it, and then remove the previous deployment (old model)?
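To illustrate the idea (this is a plain-Python toy, not the Ray Serve API — the model functions and the `TrafficShifter` router are hypothetical names), deploying the new model alongside the old one and shifting traffic could look like:

```python
import random

# Hypothetical stand-ins for the old and new model deployments.
def model_v1(x):
    return f"v1:{x}"

def model_v2(x):
    return f"v2:{x}"

class TrafficShifter:
    """Routes a fraction of requests to the new model.

    Set new_weight to 1.0 to complete the cutover, or back
    to 0.0 to "roll back" to the old model.
    """
    def __init__(self, old, new, new_weight=0.0):
        self.old, self.new = old, new
        self.new_weight = new_weight

    def __call__(self, request):
        target = self.new if random.random() < self.new_weight else self.old
        return target(request)

router = TrafficShifter(model_v1, model_v2, new_weight=0.0)
print(router("req"))        # all traffic still on the old model -> v1:req
router.new_weight = 1.0     # shift everything to the new model
print(router("req"))        # -> v2:req
router.new_weight = 0.0     # rollback: point traffic back at v1
print(router("req"))        # -> v1:req
```

The old deployment stays up until the cutover is verified, which is what makes the rollback cheap.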
Hi @Japson, as @Sihan_Wang mentioned, you can deploy your updated model to new Serve deployments on a fresh Ray cluster, and then you can shift traffic to it once the deployments are running.
Additionally, Serve performs a rolling update when you update live deployments on an existing Ray cluster. During the rolling update, your service stays live, but some requests may be handled by outdated replicas. This might also be a viable option.
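A toy sketch of why requests can hit outdated replicas during a rolling update (plain Python, not Ray Serve internals — the replica list and version strings are illustrative assumptions): replicas are replaced one at a time, so the pool is never empty but briefly serves a mix of versions.

```python
# Simulate a rolling update over three replicas running model version "v1".
replicas = ["v1"] * 3
live_versions = []  # versions serving traffic at each step of the update

for i in range(len(replicas)):
    live_versions.append(set(replicas))  # snapshot before swapping a replica
    replicas[i] = "v2"                   # replace one replica with the new version

print(live_versions)  # [{'v1'}, {'v1', 'v2'}, {'v1', 'v2'}]
print(replicas)       # ['v2', 'v2', 'v2'] -- update complete
```

At every step at least one replica is live (no downtime), but mid-update both `v1` and `v2` handle requests, which is the trade-off described above.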