I have a question about changing the GPU allocation order when using Ray Tune. For example, when I set the gpu value in resources_per_trial to 0.5, Ray Tune allocates 2 trials to GPU 0 and then to GPU 1. However, I want to assign one trial each to GPUs 0, 1, and 2. Further, when I create a trial, I want to assign it to a specific GPU (e.g., assign the first trial to GPU 2, not 0). Please let me know if there is a way to do this.
Hey @mjs , thanks for making this post!
Ray uses CUDA_VISIBLE_DEVICES to do resource allocation. So if you want to assign a trial to a specific GPU, you'll need to set CUDA_VISIBLE_DEVICES yourself.
What allocation pattern do you have in mind, specifically?
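As a rough illustration of setting the variable by hand, here is a minimal sketch. The names `pin_to_gpu`, `trainable`, and the `gpu_index` config key are hypothetical and not part of Ray's API. It assumes the variable is set before the deep learning framework initializes CUDA in that process, and that overriding whatever value Ray already placed there is acceptable for your setup.

```python
import os


def pin_to_gpu(gpu_index: int) -> None:
    """Restrict this process to a single physical GPU (hypothetical helper)."""
    # Must run before the framework initializes CUDA in this process;
    # afterwards the framework sees the chosen GPU as device 0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)


def trainable(config: dict) -> dict:
    """Hypothetical trial function; `gpu_index` is an assumed config key."""
    pin_to_gpu(config["gpu_index"])
    # ... import the framework, build the model, and train here ...
    return {"visible_devices": os.environ["CUDA_VISIBLE_DEVICES"]}
```

With this pattern, the target GPU becomes just another entry in the trial's config, so the first trial could be given `{"gpu_index": 2}`.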
I am using 4 GPUs, and I have 4 data sets: A, B, C, and D. The allocation pattern I have in mind is as follows.
“Trials 1, 2, 3, and 4”, which train a neural network with hyperparameter set F, run simultaneously on GPUs 0, 1, 2, and 3. I want to train the network using data set A on GPU 0, data set B on GPU 1, data set C on GPU 2, and data set D on GPU 3.
At the same time, I want to run additional hyperparameter sets, G and H, on each GPU in the same way.
Therefore, each GPU must allocate 0.33 of its resources to each hyperparameter trial.
“F” trials are executed concurrently because I want to aggregate the metrics from data sets A, B, C, and D:
Data set A --> GPU 0 --> “F” trial 0 --> metric for dataset A --> aggregate metric A, B, C, D
Data set B --> GPU 1 --> “F” trial 1 --> metric for dataset B
Data set C --> GPU 2 --> “F” trial 2 --> metric for dataset C
Data set D --> GPU 3 --> “F” trial 3 --> metric for dataset D
GPU 0 has “F” trial 0, “G” trial 0, “H” trial 0.
GPU 1 has “F” trial 1, “G” trial 1, “H” trial 1.
GPU 2 has “F” trial 2, “G” trial 2, “H” trial 2.
GPU 3 has “F” trial 3, “G” trial 3, “H” trial 3.
Therefore, each GPU allocates 0.33 of its memory to each “F”, “G”, or “H” trial.
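The layout above can be written down as an explicit plan, which may help when deciding how to launch the trials. This is only a sketch of the bookkeeping: `build_plan` and the dictionary keys are hypothetical names, and the code does not call Ray at all.

```python
from itertools import product

DATASETS = ["A", "B", "C", "D"]   # data set i is trained on GPU i
HPARAM_SETS = ["F", "G", "H"]     # three hyperparameter sets share each GPU


def build_plan(datasets, hparam_sets):
    """Return one entry per (hyperparameter set, data set) pair."""
    # Each GPU is split evenly between the hyperparameter sets (1/3 -> 0.33).
    gpu_fraction = round(1 / len(hparam_sets), 2)
    plan = []
    for hparams, (gpu_index, dataset) in product(hparam_sets, enumerate(datasets)):
        plan.append({
            "hparams": hparams,
            "trial": gpu_index,        # trial i of each set uses data set i
            "dataset": dataset,
            "gpu_index": gpu_index,
            "gpu_fraction": gpu_fraction,
        })
    return plan


plan = build_plan(DATASETS, HPARAM_SETS)
for entry in plan:
    print(entry)
```

Each entry could then be passed as the config of one trial, so that the GPU index and data set travel together with the hyperparameter set.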
I think you should probably use Ray Core for this. Are “F”, “G”, and “H” different values of the same parameter?
Trials 0, 1, 2, and 3 of F have the same parameters. However, I am considering the case where F, G, and H are all different parameters.