Control over Plasma Storage/Scheduling

I deal with objects of varying sizes (between a few MB and many GB). When the objects are small, I don’t mind them being moved between nodes; when they are large, I want them to remain on a single node unless there is a severe lack of resources.

What would be useful is a way to “control” which nodes objects are sent to, and which nodes jobs should be prioritised on.

E.g. if I have

Node 1 - 10 CPUs, 5 GB plasma
Node 2 - 5 CPUs, 3 GB plasma

and let’s say I have 3 objects, each needing a different number of operations:
Obj1 - 10MB needs about 20 jobs which take 5 - 10 seconds each
Obj2 - 50MB needs about 20 jobs which take 30 seconds each
Obj3 - 3GB needs about 100 jobs which take 1min each

Now ideally what I could do is something like this:

ray.put(Obj1, node1)
ray.put(Obj2, node1)
ray.put(Obj3, node1)
ray.put(Obj3, node2) # saturates node 2

for job in obj1_jobs:
    job.remote(node1)
for job in obj2_jobs:
    job.remote(node1)
for job in obj3_jobs:
    job.remote([node1, node2])

and let the scheduler figure out the rest.

I am new to distributed computing, so I’m unsure if what I’m saying just hints at a bad architecture.

In Ray, the recommended solution is to use placement groups. They ensure that tasks or actors started with the same placement group are placed onto nodes according to the strategy you specify. For your case, you can use the STRICT_PACK strategy. See Placement Groups — Ray v1.2.0.
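
For example, here is a minimal sketch of reserving resources on one node with STRICT_PACK (the bundle size of 5 CPUs is illustrative, not tuned to your cluster):

import ray
from ray.util.placement_group import placement_group

ray.init(address="auto")

# One bundle of 5 CPUs; STRICT_PACK forces all bundles onto a single node.
pg = placement_group([{"CPU": 5}], strategy="STRICT_PACK")
ray.get(pg.ready())  # block until the reservation is fulfilled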

What you can do is submit the tasks/actors that use the same object to the same placement group with the STRICT_PACK or PACK strategy. This ensures the object is consumed by tasks/actors that are scheduled onto the packed node(s).
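
Continuing the sketch above, tasks that use the same object can then be scheduled into that group via .options(placement_group=pg). The process function, the placeholder object, and the job count below are just stand-ins for your own workload:

@ray.remote(num_cpus=1)
def process(obj):
    # stand-in for one of your ~100 jobs on the large object
    return len(obj)

obj3 = b"..."              # stand-in for the real ~3 GB object
obj3_ref = ray.put(obj3)   # put it in the object store once

# All tasks land on the node(s) holding the placement group,
# so the large object only needs to be transferred there once.
futures = [process.options(placement_group=pg).remote(obj3_ref) for _ in range(100)]
results = ray.get(futures)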

Cheers, this looks like what I’m looking for. I’ll give it a go!
