How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
After following the 60-second RLlib guide, I am unable to see any worker nodes being utilized on the Kubernetes Ray cluster; only the head node is utilized. However, if I change the config to num_workers=0 and remote_worker_envs=True, all of the CPUs fire up, but this doesn't seem like what I want (?). My multi-node cluster consists of 5 nodes. I am hoping to run one of the examples across the full cluster for testing purposes and see how long it takes to train. Am I missing something?
If I manually connect to Ray and run the object transfer example, that also works, so I think this is some sort of RLlib config issue I am unaware of.
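For reference, here is a minimal sketch of the kind of config I would expect to spread rollouts over the worker nodes, using the dict-style Trainer config. The env name and CPU counts are illustrative assumptions, not values from the guide; my understanding is that num_workers > 0 creates remote rollout workers (which Ray can schedule on other nodes), while remote_worker_envs=True instead turns each environment inside a worker into its own actor, which would explain why all CPUs fired up in the num_workers=0 experiment:

```python
# Sketch of an RLlib Trainer config dict aimed at using worker nodes
# for rollouts. All concrete values below are assumptions for
# illustration, not the exact config from the 60-second guide.
config = {
    "env": "CartPole-v1",  # hypothetical example env
    # One remote rollout worker per worker node of the 5-node cluster
    # (the head node runs the trainer/driver itself).
    "num_workers": 4,
    # CPUs each rollout worker requests from the cluster scheduler.
    "num_cpus_per_worker": 1,
    # Keep envs in-process on each worker. Setting this to True makes
    # each *env* a separate remote actor instead, which is presumably
    # what lit up all CPUs when num_workers was 0.
    "remote_worker_envs": False,
}
```

With something like this, I would then expect `ray.init(address="auto")` on the cluster followed by building the trainer with this config to place the four rollout workers on the worker nodes, but that is exactly the behavior I am not seeing.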