How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
Greetings. I listened to several interesting talks by Jules on YouTube, where he mentions that Ray understands data locality and tries to schedule a task on a specific node (let's say N2) if the object required for the task (let's say X) resides in the object store of N2. On this aspect, I have some naive queries.
- If node N2 is busy, does Ray wait until it becomes available, or does it automatically schedule the task on another node (let's say N5) whose workers, let's assume, are free?
- In the Ray paper (https://www.usenix.org/system/files/osdi18-moritz.pdf), the text for Fig. 7 highlights that a task scheduled on a specific node can fetch the object it requires from another node if the object is not available in the node’s local object store. How does this happen? Does Ray use a specific IPC mechanism for this inter-node data transfer? I am not getting it. Am I missing something?
- I am using Ray on HPC with Slurm. When I use dask-mpi on HPC, I specifically set a parameter (interface='ib0') for InfiniBand. Is there any such parameter that I should set for Ray when it is used on HPC?
- Could you please point me to any documentation relevant to the above questions, especially on inter-node communication?
Thanks.