- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Ray 2.4.0, Python 3.10, cluster on GCP
I would like to update Serve deployments without restarting the cluster, but I ran into the following issue.
When I tried to update the code (by setting a different runtime_env.working_dir value in the config and running serve deploy config.json), serve status reported that the deployments were updating:
name: default
app_status:
  status: DEPLOYING
  message: ''
  deployment_timestamp: 1686651268.6050997
deployment_statuses:
- name: default_DLModelProcessor
  status: HEALTHY
  message: ''
- name: default_Backend
  status: UPDATING
  message: ''
However, when I send a request to the Serve application, a cycle of nodes launching and terminating begins, because the DLModelProcessor deployment fails with: ModuleNotFoundError: No module named 'holistic_inference'
I think the new version of working_dir is downloaded on the new nodes, while the head node still has the outdated one.
How can I update the deployment code without restarting the cluster?
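For reference, the update step described above (changing runtime_env.working_dir in config.json before running serve deploy config.json) can be sketched roughly as follows. This is only an illustration and not the exact script I used: it assumes the config keeps runtime_env at the top level of the JSON file, and the helper names and the example URI are hypothetical placeholders.

```python
import copy
import json


def set_working_dir(config: dict, new_working_dir: str) -> dict:
    """Return a copy of a Serve config dict with runtime_env.working_dir replaced.

    Assumption: runtime_env sits at the top level of the config.
    """
    updated = copy.deepcopy(config)
    # Create runtime_env if it is missing, then overwrite only working_dir.
    updated.setdefault("runtime_env", {})["working_dir"] = new_working_dir
    return updated


def update_config_file(path: str, new_working_dir: str) -> None:
    """Rewrite the JSON config file at `path` with the new working_dir."""
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(set_working_dir(config, new_working_dir), f, indent=2)


# Hypothetical usage; the URI is a placeholder for a new code archive:
# update_config_file("config.json", "https://example.com/my_code_v2.zip")
# After that, run: serve deploy config.json
```

Pointing working_dir at a new URI for each code version is what I do to trigger the update; the original config is left untouched by set_working_dir because it works on a deep copy.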