Then I run ray.init and ray.nodes in another container in my Kubernetes cluster.
The issue is that ray.nodes sometimes returns only the head node; if I sleep and try again, it returns the Ray worker node as well.
My guess is that I need to set up startup/readiness/liveness probes so that I can tell when all Ray nodes are ready, and only then call ray.init and ray.nodes.
I can’t find any documentation on the subject, so I am not sure how to configure the YAML files.
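
For reference, the sleep-and-retry workaround can be written as a small polling helper. This is only a sketch, not a replacement for proper probes: wait_for_nodes, expected, timeout_s and poll_s are names I made up for illustration, and the only Ray behaviour it relies on is that ray.nodes() returns a list of dicts with an "Alive" field. The node lister is passed in as a callable so the helper can be exercised without a cluster.

```python
import time
from typing import Callable, Dict, List


def wait_for_nodes(
    list_nodes: Callable[[], List[Dict]],
    expected: int,
    timeout_s: float = 60.0,
    poll_s: float = 1.0,
) -> List[Dict]:
    """Poll list_nodes() until at least `expected` nodes report Alive.

    Hypothetical helper: `list_nodes` is meant to be ray.nodes, and
    `expected` is the node count I assume the cluster should reach
    (head node plus workers).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Keep only the entries that the cluster currently marks as alive.
        alive = [node for node in list_nodes() if node.get("Alive")]
        if len(alive) >= expected:
            return alive
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(alive)} of {expected} nodes alive after {timeout_s}s"
            )
        time.sleep(poll_s)
```

After ray.init has connected to the cluster, it would be called as wait_for_nodes(ray.nodes, expected=2) for one head node and one worker. The obvious drawback is that the expected node count has to be known in advance.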
The reason I am using ray.nodes is this: