Running DD-PPO on multiple GPUs

Hi all!

I am trying to run the DD-PPO algorithm, giving each worker 1 GPU, using this configuration file:
https://github.com/ray-project/ray/blob/master/rllib/tuned_examples/ppo/atari-ddppo.yaml
Since my machine has 4 GPUs, I set num_workers: 4.
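For context, the relevant part of my setup looks roughly like the sketch below. This is not a verbatim copy of the linked tuned example; the env name and the keys other than num_workers are illustrative assumptions based on RLlib's DD-PPO config options:

```yaml
# Sketch of a DD-PPO tuned-example config (illustrative values):
atari-ddppo:
    env: BreakoutNoFrameskip-v4
    run: DDPPO
    config:
        framework: torch
        # One rollout worker per GPU on a 4-GPU machine:
        num_workers: 4
        # Each worker should get its own GPU:
        num_gpus_per_worker: 1
        # DD-PPO learns on the workers, so the driver takes no GPU:
        num_gpus: 0
```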

The problem is that all 4 workers end up on the same GPU instead of each using its own.
When I print the resources assigned to the workers, I get the following output for each of them:

{'CPU_group_05d7394261ee785f26194bc7e08bb664': [(0, 1.0)], 'GPU_group_05d7394261ee785f26194bc7e08bb664': [(0, 1.0)]}

I understand that Ray then sets CUDA_VISIBLE_DEVICES=0 for every worker, which would explain why they all use GPU 0.
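To illustrate the remapping effect I suspect here: CUDA_VISIBLE_DEVICES restricts which physical GPUs a process can see, and the visible ones are renumbered starting from logical index 0. The helper below is a hypothetical sketch of that mapping, not Ray or CUDA code:

```python
import os

def visible_devices(env_value):
    """Return the physical GPU ids a process can see, in logical order.

    With CUDA_VISIBLE_DEVICES="2", a process sees one GPU whose logical
    index is 0 but whose physical id is 2. Returns None when the variable
    is unset, meaning all GPUs are visible.
    """
    if env_value is None:
        return None
    return [int(d) for d in env_value.split(",") if d != ""]

# If Ray sets CUDA_VISIBLE_DEVICES="0" for every worker, each worker's
# logical device 0 maps to the same physical GPU 0:
assert visible_devices("0") == [0]

# Whereas distinct per-worker values would map logical device 0 to
# different physical GPUs, e.g. "1" -> physical GPU 1:
assert visible_devices("1") == [1]
```

So if all four workers receive the value "0", their frameworks will all allocate on the same physical device even though each believes it is using "its" GPU 0.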

Why does the GPU_group always show GPU 0 for every worker? Is this expected behavior?

Thanks!

Yes, this is expected behavior. It is the model replicas that get distributed across the GPUs.

Then how can multiple workers be distributed across multiple GPUs?