[Cluster, Serve] Is it possible to configure cluster fault tolerance without `ray up`?

I have configured the cluster via manual ray start in docker containers on remote servers. I have the head node in k8s with the same startup principle (ray start).
But when the head node is refreshed all information about running applications is overwritten and applications are not restored.

Is it possible to configure cluster fault tolerance without ray up?
Is it possible to fix this without using KubeRay and ray up?
If not, what exactly is the difference between KubeRay and regular k8s? Maybe it is possible to customize it without KubeRay directives.

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Versions

  • python == 3.11.5
  • ray == 2.8.1