Preserving GCP VMs After "ray down" for reuse on subsequent "ray up"

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

After running ray down, all the VMs in my cluster are deleted, including the head and worker nodes. This behavior is expected, as ray down is designed to tear down the cluster. However, the next ray up creates new VMs from scratch and pulls the Docker images again, so startup is significantly delayed while the environment is set up anew.

In contrast, when I used Ray on AWS, the VMs were not deleted after running ray down, allowing for a much quicker startup on the next ray up since the Docker images and environment setup were preserved.

My question is: Is there a way to configure Ray on GCP to preserve VMs after executing ray down, so that they can be reused on subsequent ray up commands? Ideally, this would skip the Docker image pull and environment setup, significantly reducing startup time.

Can you file a GitHub issue for this? It is probably somewhere in the Ray Core cluster launcher code path where we have this divergent logic…

I opened an issue on GitHub about this 3 days ago, but there hasn’t been any response yet.