Does Ray Serve support local model hot update/reload?

That is, can I smoothly update the deployed models to the latest version without having to stop the online service? It should also support rollback operations.

Just like TensorFlow Serving.

Hi Japson,

Currently, Ray Serve doesn't have built-in functionality for model reload, rollback, or version control.

(To unblock you) To roll out a new model version, can you directly trigger another deployment (the new model), shift traffic to it, and then remove the previous deployment (the old model)?

Hi @Japson, as @Sihan_Wang mentioned, you can deploy your updated model to new Serve deployments on a fresh Ray cluster, and then you can shift traffic to it once the deployments are running.
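To make the "shift traffic" step concrete, here is a minimal sketch in plain Python. It does not use any Ray Serve API: `TrafficSplitter`, `model_v1`, and `model_v2` are hypothetical names standing in for whatever router or load balancer sits in front of your two deployments. It only illustrates gradually moving traffic to the new version and rolling back by sending everything to the old one again.

```python
import random
from typing import Callable

Handler = Callable[[str], str]


class TrafficSplitter:
    """Hypothetical router: sends a fraction of requests to the new version."""

    def __init__(self, old: Handler, new: Handler, new_weight: float = 0.0):
        self.old = old
        self.new = new
        self.new_weight = new_weight  # fraction of traffic for the new version

    def shift(self, new_weight: float) -> None:
        """Set the fraction of traffic routed to the new version."""
        if not 0.0 <= new_weight <= 1.0:
            raise ValueError("new_weight must be between 0 and 1")
        self.new_weight = new_weight

    def rollback(self) -> None:
        """Route all traffic back to the old version."""
        self.new_weight = 0.0

    def __call__(self, request: str) -> str:
        # random.random() is in [0, 1), so weight 0 never picks the new
        # version and weight 1 always picks it.
        target = self.new if random.random() < self.new_weight else self.old
        return target(request)


# Stand-ins for the old and new model deployments (assumptions).
def model_v1(request: str) -> str:
    return f"v1:{request}"


def model_v2(request: str) -> str:
    return f"v2:{request}"


router = TrafficSplitter(model_v1, model_v2)
router.shift(0.1)   # canary: roughly 10% of requests go to the new model
router.shift(1.0)   # full cutover once the new model looks healthy
```

Once all traffic is on the new version and it has been stable for a while, the old deployment can be removed. Until then, `router.rollback()` is the escape hatch.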

Additionally, Serve does a rolling update when you update your live deployments on an existing Ray cluster. In this case:

  1. Serve tears down some replicas from your old deployment.
  2. Serve replaces them with replicas from your new deployment.
  3. Serve repeats this with some more replicas from your old deployment until all replicas have updated.

During this time, your service is still live, but some requests may be handled by outdated replicas. This might also be a viable option.
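The rolling update steps above can be simulated with a short, plain-Python sketch. This is not Serve's actual implementation: the `rolling_update` function and its `batch_size` parameter are assumptions made for illustration, and a replica is represented only by its version string. It shows why old and new replicas coexist while the update is in progress.

```python
from typing import Iterator, List


def rolling_update(
    replicas: List[str], new_version: str, batch_size: int = 1
) -> Iterator[List[str]]:
    """Replace replicas batch by batch, yielding the replica set after each batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    current = list(replicas)  # copy so the caller's list is not modified
    for start in range(0, len(current), batch_size):
        stop = min(start + batch_size, len(current))
        # Tear down a batch of old replicas and start new ones in their place.
        for i in range(start, stop):
            current[i] = new_version
        yield list(current)


# Four replicas of the old model, updated two at a time.
states = list(rolling_update(["v1"] * 4, "v2", batch_size=2))
for step, state in enumerate(states, start=1):
    print(f"after batch {step}: {state}")
```

After the first batch the replica set is mixed, which is the window in which a request can still land on an outdated replica.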