Ray for HPC domain and Legion Programming System comparison

brombaut · January 13, 2023, 6:31pm

How severely does this issue affect your experience of using Ray?

None: Just asking a question out of curiosity

I’m curious whether anyone has experience applying Ray to HPC workloads. Have you run into problem areas in Ray’s design, or places where Ray’s performance falls short for HPC jobs compared with the ML/AI workloads it was designed for? I’d also like to hear thoughts on how Ray’s design compares to Legion (https://legion.stanford.edu/), which seems very similar to Ray but is targeted specifically at the HPC domain.

As an example of a similar feature, Ray can cache scheduling decisions to amortize scheduling RPC overhead across similar tasks. Legion has a comparable idea in index space tasks, which efficiently launch a large number of non-interfering tasks.
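To make the comparison concrete, here is a toy sketch in plain Python of the amortization idea: an expensive placement decision is made once per task "shape" and reused for later tasks with the same shape. This is not Ray's or Legion's actual implementation. `ToyScheduler`, `_decide`, `launch_many`, and the shape key (function name plus CPU request) are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskShape:
    """Hypothetical key: tasks with the same function and resources look alike."""
    func_name: str
    num_cpus: int


@dataclass
class ToyScheduler:
    """Toy model only; not how Ray or Legion is implemented."""
    nodes: tuple = ("node-a", "node-b")
    decisions_made: int = 0  # counts the expensive scheduling round trips
    _cache: dict = field(default_factory=dict)

    def _decide(self, shape: TaskShape) -> str:
        # Stand-in for an expensive scheduling RPC (round-robin over nodes).
        self.decisions_made += 1
        return self.nodes[(self.decisions_made - 1) % len(self.nodes)]

    def place(self, shape: TaskShape) -> str:
        # Reuse the earlier decision for tasks with an identical shape.
        if shape not in self._cache:
            self._cache[shape] = self._decide(shape)
        return self._cache[shape]


def launch_many(scheduler: ToyScheduler, shape: TaskShape, count: int) -> list:
    """Place `count` similar tasks, loosely analogous to one bulk launch."""
    return [scheduler.place(shape) for _ in range(count)]
```

In this sketch, launching 1000 tasks of the same shape triggers a single call to `_decide`; a task with a different shape triggers a second one.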

Legion also has the idea of Dynamic Tracing (see the paper https://legion.stanford.edu/pdfs/trace2018.pdf), which is essentially JIT compiling the task graph. Quoting the paper: “…dynamic tracing, a technique to efficiently and correctly memoize a dynamic dependence analysis and generate a task graph semantically equivalent to (but also often syntactically different from) the original.” I haven’t been able to find a similar idea applied to Ray. That may be because I haven’t fully understood how Ray dynamically generates its task graph, or because the overhead that Dynamic Tracing addresses isn’t really an issue for Ray; it’s just an example I came up with for comparison.
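For anyone unfamiliar with the idea, here is my rough mental model of memoizing a dependence analysis, as a toy sketch in plain Python. It is a heavy simplification and not Legion's algorithm: `analyze_dependences`, `ToyTracer`, and the `(name, reads, writes)` task format are hypothetical, and the real system handles far more than this (for example, validating preconditions before a replay).

```python
def analyze_dependences(tasks):
    """Naive dependence analysis over (name, reads, writes) tuples.

    Task j depends on an earlier task i if they touch a common region
    and at least one of the two writes it.
    """
    edges = []
    for j, (_, reads_j, writes_j) in enumerate(tasks):
        for i in range(j):
            _, reads_i, writes_i = tasks[i]
            conflict = (writes_i & (reads_j | writes_j)) | (reads_i & writes_j)
            if conflict:
                edges.append((i, j))
    return edges


class ToyTracer:
    """Memoizes the analysis for a repeated task sequence (toy model only)."""

    def __init__(self):
        self.analyses_run = 0  # how often the full analysis was executed
        self._traces = {}      # trace_id -> (signature, edges)

    def run(self, trace_id, tasks):
        # Signature used to check that a replay matches the recording.
        signature = tuple((n, frozenset(r), frozenset(w)) for n, r, w in tasks)
        cached = self._traces.get(trace_id)
        if cached is not None and cached[0] == signature:
            return cached[1]  # replay: skip the analysis entirely
        self.analyses_run += 1
        edges = analyze_dependences(signature)
        self._traces[trace_id] = (signature, edges)
        return edges
```

The first call to `run` for a given trace id pays for the analysis; later calls with an identical task sequence return the recorded edges, and a changed sequence falls back to a fresh analysis.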

Would love to hear if anyone else has thoughts on these topics.
