Relationship between RayService and RayCluster

Previously, my understanding was that a RayService instantiates one RayCluster and that all compute is carried out on the head and worker nodes of that RayCluster.

In practice, however, I’ve seen that a RayService may kill its RayCluster (for whatever reason) and then create a new one.

How often is a RayService expected to kill a RayCluster? In my case the deployment is failing, which is why a new RayCluster gets spun up, but are there any other cases where this happens?

cc @Kai-Hsun_Chen @architkulkarni I'm assuming this is in the context of KubeRay setting up a RayService in the CR

Thanks for the quick reply @Jules_Damji. I was mistaken: re-creating the RayCluster once the service has been unhealthy for the duration of serviceUnhealthySecondThreshold is a feature of RayService.
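
For anyone who finds this later, here is a rough Python sketch of the behaviour described above, as I understand it. It is not KubeRay's actual code (the operator is written in Go), and `HealthTracker`, `observe`, and `threshold_s` are hypothetical names made up for illustration. The idea is simply that a timer starts when the service first reports unhealthy, resets when it recovers, and a replacement cluster is requested once the timer reaches the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthTracker:
    """Hypothetical model of the unhealthy-duration check (not KubeRay code)."""

    threshold_s: float  # stands in for serviceUnhealthySecondThreshold
    unhealthy_since: Optional[float] = None  # time the current unhealthy streak began

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one health observation; return True if the cluster should be replaced."""
        if healthy:
            # Any healthy report resets the streak.
            self.unhealthy_since = None
            return False
        if self.unhealthy_since is None:
            # First unhealthy report in this streak: start the timer.
            self.unhealthy_since = now
        # Assumption: the threshold is treated as inclusive here.
        return now - self.unhealthy_since >= self.threshold_s


tracker = HealthTracker(threshold_s=60)
for t, healthy in [(0, False), (30, False), (60, False)]:
    if tracker.observe(healthy, now=t):
        print(f"t={t}s: unhealthy past the threshold, a new RayCluster would be created")
```

Under this model, a deployment that never becomes healthy (as in my case) would trigger a replacement every time the threshold elapses.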

@alfredh So are we sorted then? If so, can we close this issue?
