How does the scheduler "decide" to send jobs based on resources/plasma storage?

Let’s say I have a cluster of 2 nodes with 10 CPUs and 16 GB of plasma storage each.

Let’s say node 1 has obj1 (15 GB) with 6 cores in use, and node 2 has obj2 (1 GB) with 5 cores in use.

Let’s say I have a job that needs to operate on obj1 and uses 5 CPU cores. Does Ray wait for the jobs on node 1 to finish before sending the job, since the 15 GB object is already written there? Or does Ray have the ability to run it on node 2 instead?

I’m not sure how this all works. I have a workload whose objects are painfully slow to load into plasma, so I want to spread them across nodes and run each job on the node that holds its objects — unless there’s an abnormally large queue on that node, at which point I’d copy the object it needs to an available node. I’m brand new to Ray: would I have to do this manually, by tracking object locations and resource availability myself and moving objects around?
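The manual policy described above can be sketched as toy logic in plain Python. This is a hypothetical helper (`place_job` and the `nodes` dict are made-up names, not a Ray API), just to make the decision rule concrete:

```python
# Toy sketch (plain Python, NOT Ray's API) of the manual policy:
# run a job on a node that already holds its object, unless that node's
# queue is abnormally long, in which case copy the object elsewhere first.

def place_job(obj_id, needed_cpus, nodes, max_queue=10):
    """nodes: {node_id: {"objects": set, "free_cpus": int, "queue": int}}
    Returns (chosen_node_or_None, whether_the_object_was_copied)."""
    holders = [n for n, s in nodes.items() if obj_id in s["objects"]]
    # Prefer a node that already holds the object and isn't overloaded.
    for n in holders:
        if nodes[n]["queue"] <= max_queue and nodes[n]["free_cpus"] >= needed_cpus:
            return n, False  # data-local, no copy needed
    # Otherwise pick the least-queued node with capacity and copy the object.
    candidates = [n for n, s in nodes.items() if s["free_cpus"] >= needed_cpus]
    if not candidates:
        return None, False  # no capacity anywhere; job must wait
    target = min(candidates, key=lambda n: nodes[n]["queue"])
    nodes[target]["objects"].add(obj_id)  # simulate the (slow) object transfer
    return target, True
```

This is roughly the bookkeeping you’d be signing up for if you did it by hand: an object-location map, per-node load, and a queue threshold that triggers replication.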

Hey @alexisdrakopoulos, currently the Ray scheduler makes scheduling decisions based on resource availability first, with best-effort data locality as a secondary preference. So Ray will try to place tasks where their input data lives, but if that node lacks free resources and another node has them, the task is scheduled there and the object is transferred. In your example, node 1 only has 4 free CPUs, so the 5-CPU task would go to node 2 and obj1 would be fetched over, rather than the task waiting on node 1.
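A toy illustration of that “resources first, locality best-effort” rule, in plain Python (this is a simplification for intuition, not Ray’s actual scheduler code — `choose_node` and the `nodes` dict are hypothetical):

```python
# Toy sketch of resource-first scheduling with best-effort locality:
# among nodes with enough free CPUs, prefer one that already holds the
# task's input object; if none does, any node with capacity wins.

def choose_node(needed_cpus, input_obj, nodes):
    """nodes: {node_id: {"free_cpus": int, "objects": set}}"""
    feasible = [n for n, s in nodes.items() if s["free_cpus"] >= needed_cpus]
    if not feasible:
        return None  # task queues until resources free up somewhere
    local = [n for n in feasible if input_obj in nodes[n]["objects"]]
    return local[0] if local else feasible[0]
```

Plugging in your numbers: node 1 holds obj1 but has only 4 free CPUs, so it’s filtered out before locality is even considered, and the 5-CPU task lands on node 2.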

@Clark_Zinzow may be able to give you more information here; it’s an area of active development.