I’m new to Ray and still at the learning stage. I started working on some example code to check task distribution.
I ran my test in two ways.
- Started the Ray head node manually.
The dashboard showed 2 cores and no usage. I then ran my example Python work script and saw 100% CPU utilization.
Then, from another local system, I connected to the Ray head node. The second node immediately appeared on the dashboard, but for some reason no work is assigned to it. I tried connecting a few more nodes; they all show up as available, but no work is assigned to them either. Their CPU load stays at 2-4% while the head node sits at 100%.
What am I doing wrong?
- Started a local Ray cluster. (I’m not planning to let Ray handle scaling.)
Exact same situation as above.
Why are tasks not distributed to new nodes that connect to the Ray cluster/head node after the work has started?
If the nodes are connected first and I then run the work script, the tasks are distributed equally across all nodes.
But my requirement is to bring up nodes if and when required.
Isn’t that how it’s supposed to work? If I connect a node to the master (head node), it should start using the new node for whatever work is left (and there is plenty left).
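For reference, here is a minimal sketch of the kind of test described above. It is not the exact work script: the names `busy_work`, `count_by_host` and `run_on_cluster` are made up for illustration, and it assumes a cluster is already running and reachable with `ray.init(address="auto")`. Each task burns CPU for a short time and returns the hostname of the machine it ran on, so the final tally shows which nodes actually received work.

```python
import socket
import time
from collections import Counter


def count_by_host(hostnames):
    """Tally how many tasks each host reported running."""
    return Counter(hostnames)


def run_on_cluster(num_tasks=200, seconds_per_task=1.0, address="auto"):
    """Submit CPU-bound tasks and report how many ran on each node.

    Assumption: a Ray cluster is already running and reachable via `address`.
    """
    # Imported here so count_by_host can be used without Ray installed.
    import ray

    ray.init(address=address)

    @ray.remote(num_cpus=1)
    def busy_work(seconds):
        # Spin for the requested time so the CPU load is visible on the dashboard.
        end = time.time() + seconds
        while time.time() < end:
            pass
        # Report which machine executed this task.
        return socket.gethostname()

    # All tasks are submitted up front, as in the scenario described above.
    refs = [busy_work.remote(seconds_per_task) for _ in range(num_tasks)]
    return count_by_host(ray.get(refs))
```

Running `print(run_on_cluster())` once with all nodes connected beforehand, and once while connecting nodes during the run, should make the difference in distribution visible as per-host task counts.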
Sorry, I’m new to Ray and couldn’t find anything online that explains what I’m doing wrong.