Data Locality mentioned in Jules Damji talks

How severe does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

Greetings. I listened to several interesting talks of Jules available in youtube where he mentions that Ray understands data locality and tries to schedule a task on a specific node (Lets say N2) if the object required (Lets say X) for the task resides on the object store of N2. On this aspect, i have some naive queries.

  1. In case the node N2 is busy, does Ray waits till it becomes available or automatically schedules the task on another node (Lets say N5) whose workers, lets assume, are free.

  2. In Ray article (https://www.usenix.org/system/files/osdi18-moritz.pdf), text for Fig.7 highlights that a task scheduled on a specific node can fetch the object required for the task from another node if it is not available on the node’s local object store. I am wondering, how does this happen. Does Ray uses any specific IPC for this inter-node data transfer, i am not getting it . Am i missing something ?

  3. I am using Ray on HPC with Slurm. When i use dask-mpi on HPC, i specifically set a parameter (interface=‘ib0’) for infiniband. Is there any such parameter that i should set for Ray when used on HPC ?.

  4. Could you please any documentation relevant to the above questions, especially inter-node communications.

Thanks.

Hey Bala, welcome to the Ray community. Have you given the Ray 2.0 whitepaper a read: Architecture Whitepapers — Ray 2.10.0 < should have answers to most of your questions.

Re 2) it’s over RPC I believe.

Have to think a bit for your remaining questions.

  1. Yes the task will get scheduled to another node if it’s free first.
  2. Yes, RPC. By the way, this paper is a bit out of date, and agree with @Sam_Chan that you should check out the whitepaper for up-to-date information.
  3. Currently we don’t have special support for Infiniband. All data transfer will happen over TCP.