High: It blocks me from completing my task.
I am running a head node and a worker node manually in a Kubernetes cluster.
I based my YAML files on ray/ray-cluster.yaml at master · ray-project/ray · GitHub.
Then I run ray.init() and ray.nodes() from another container in the same Kubernetes cluster.
The issue is that ray.nodes() sometimes returns only the head node. If I sleep and try again, it returns the worker node as well.
My guess is that I need to set up startup/readiness/liveness probes so that I can tell when all Ray nodes are ready, and only then call ray.init() and ray.nodes().
I can't find any documentation on the subject, so I am not sure how to configure the YAML files.
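For reference, the sleep-and-retry workaround I describe above can be sketched as a polling loop. This is only a rough sketch, not a replacement for proper probes: `wait_for_nodes`, its parameters, and the expected node count of 2 are names and values I made up for illustration, and it relies on each entry returned by ray.nodes() carrying an "Alive" flag.

```python
import time


def wait_for_nodes(list_nodes, expected, timeout_s=60.0, poll_interval_s=1.0):
    """Poll list_nodes() until at least `expected` nodes report Alive.

    `list_nodes` is any zero-argument callable returning a list of dicts
    with an "Alive" key, for example ray.nodes. This helper is hypothetical
    and not part of Ray.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Keep only the nodes the cluster currently reports as alive.
        alive = [node for node in list_nodes() if node.get("Alive")]
        if len(alive) >= expected:
            return alive
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(alive)} of {expected} nodes alive after {timeout_s}s"
            )
        time.sleep(poll_interval_s)


# Usage against a real cluster (not executed here; the address value is an
# assumption about the setup and may need to point at the head node instead):
#   import ray
#   ray.init(address="auto")
#   nodes = wait_for_nodes(ray.nodes, expected=2)
```

Passing the node-listing function in as an argument keeps the loop testable without a running cluster, and the timeout avoids hanging forever if the worker never joins.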
I am using ray.nodes() because of this: