How to let actor inherit resource from its owner?

Icarus · January 11, 2021, 2:53am

Hi everyone, I have a question about resource management in Ray.

In my application, I need to run a deep learning training actor on each GPU and periodically run some tests on the learned model, which is a fairly standard deep learning pipeline. In my case, some of the tests need a GPU, but only a small share of it. I want to use Ray to run them in parallel to speed up testing. However, in order to run a test as a Ray task, I need to request new resources for it.

My current practice: say I have two tests that need GPUs. I schedule the training actor with 0.8 GPUs and each test with 0.1 GPUs. It works, but it is definitely not elegant: it does not adapt to the number of tests, and the tasks may be scheduled on a different device from the one the training actor uses.
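For reference, the setup described above might look roughly like the sketch below. The names `gpu_split`, `run_fractional`, `run_test`, and `Trainer` are made up for illustration, and the model weights are a placeholder; only `ray.remote(num_gpus=...)`, `ray.init`, and `ray.get` are Ray calls. It assumes a machine with at least one GPU, otherwise the resource requests cannot be satisfied and the calls will wait.

```python
def gpu_split(num_tests, gpu_per_test=0.1):
    """Return (trainer_share, per_test_share) so that all shares sum to one GPU."""
    trainer_share = round(1.0 - num_tests * gpu_per_test, 4)
    if trainer_share <= 0:
        raise ValueError("the tests would leave no GPU share for the trainer")
    return trainer_share, gpu_per_test


def run_fractional(num_tests=2):
    # Imported here so gpu_split() can be used without Ray installed.
    import ray

    trainer_share, test_share = gpu_split(num_tests)

    @ray.remote(num_gpus=test_share)
    def run_test(test_id, weights):
        # Hypothetical test body: a real test would evaluate the model here.
        return test_id

    @ray.remote(num_gpus=trainer_share)
    class Trainer:
        def train_and_test(self):
            weights = None  # placeholder for the learned model
            # The tests are separate tasks with their own GPU requests;
            # nothing ties them to the GPU this actor is running on.
            refs = [run_test.remote(i, weights) for i in range(num_tests)]
            return ray.get(refs)

    ray.init()
    trainer = Trainer.remote()
    print(ray.get(trainer.train_and_test.remote()))
```

The split has to be recomputed by hand whenever the number of tests changes, which is the inflexibility I mean.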

One thought: since the training actor is the owner of the test tasks, could they simply inherit the resources of their owner? I think this would make the program more flexible and easier to understand, since it would behave like multiprocessing on a local machine.

Alex · January 11, 2021, 10:01pm

You can use placement groups for this. Create a placement group with 3 bundles, then assign the training actor to one bundle, and the tasks to the other bundles.

https://docs.ray.io/en/master/placement-group.html#placement-groups
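
A minimal sketch of what this could look like, assuming a recent Ray 2.x release where placement groups are attached through `PlacementGroupSchedulingStrategy` (older releases passed `placement_group=` directly to `.options()`). The helper names `make_bundles`, `run_with_placement_group`, `run_test`, and `Trainer` are hypothetical, the 0.8/0.1/0.1 split is taken from the question, and fractional GPU bundles are assumed to be accepted by your Ray version. Note that `STRICT_PACK` only guarantees that all bundles land on the same node; on a multi-GPU node it may not pin them to the same physical device.

```python
def make_bundles(num_tests, gpu_per_test=0.1):
    """One bundle for the trainer, then one bundle per test task."""
    trainer_share = round(1.0 - num_tests * gpu_per_test, 4)
    if trainer_share <= 0:
        raise ValueError("the tests would leave no GPU share for the trainer")
    return [{"GPU": trainer_share}] + [{"GPU": gpu_per_test} for _ in range(num_tests)]


def run_with_placement_group(num_tests=2):
    # Imported here so make_bundles() can be used without Ray installed.
    import ray
    from ray.util.placement_group import placement_group, remove_placement_group
    from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

    bundles = make_bundles(num_tests)
    ray.init()

    # Reserve all bundles together on a single node.
    pg = placement_group(bundles, strategy="STRICT_PACK")
    ray.get(pg.ready())  # blocks until the reservation succeeds

    def in_bundle(index):
        return PlacementGroupSchedulingStrategy(
            placement_group=pg, placement_group_bundle_index=index
        )

    @ray.remote
    def run_test(test_id):
        # Hypothetical test body.
        return test_id

    @ray.remote
    class Trainer:
        def ping(self):
            return "ready"

    # Bundle 0 holds the training actor.
    trainer = Trainer.options(
        num_gpus=bundles[0]["GPU"], scheduling_strategy=in_bundle(0)
    ).remote()

    # Bundles 1..num_tests hold one test task each.
    test_refs = [
        run_test.options(
            num_gpus=bundles[i + 1]["GPU"], scheduling_strategy=in_bundle(i + 1)
        ).remote(i)
        for i in range(num_tests)
    ]

    print(ray.get(trainer.ping.remote()), ray.get(test_refs))
    remove_placement_group(pg)
```

Calling `run_with_placement_group()` on a node with a GPU should reserve the three bundles and run the actor and both tests inside them. Because the bundle list is built from `num_tests`, changing the number of tests only changes one argument.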
