Trials placed on the same GPU on a 2 GPU machine despite "num_gpus": 1

If I run the following experiment, Ray places both trials on the same GPU. Is this expected?

import gym
import numpy as np
import ray
from gym.spaces import Box, Discrete
from ray import tune


class SimpleCorridor(gym.Env):
    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(0.0,
                                     self.end_pos,
                                     shape=(1, ),
                                     dtype=np.float32)

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        assert action in [0, 1], action
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        return [self.cur_pos], 1.0 if done else -0.1, done, {}


if __name__ == "__main__":
    ray.init()

    config = {
        "env": SimpleCorridor,
        "env_config": {
            "corridor_length": 5,
        },
        "num_gpus": 1,
        "lr": tune.grid_search([1e-4, 1e-4]),
        "num_workers": 2,
        "num_envs_per_worker": 1,
        "framework": "torch",
    }

    stop = {"training_iteration": 10}
    results = tune.run('PPO', config=config, stop=stop, verbose=1)

The reported resource requests:

Resources requested: 8.0/24 CPUs, 2.0/2 GPUs, 0.0/17.1 GiB heap, 0.0/8.55 GiB objects (0.0/1.0 accelerator_type:GTX)

nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:0A:00.0 Off |                  N/A |
|  0%   55C    P2    99W / 250W |   1571MiB / 11178MiB |     63%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:41:00.0 Off |                  N/A |
|  0%   46C    P8    13W / 250W |     14MiB / 11176MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                                
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1581      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A   1528882      C   ray::PPO.train_buffered()         781MiB |
|    0   N/A  N/A   1528905      C   ray::PPO.train_buffered()         781MiB |
|    1   N/A  N/A      1581      G   /usr/lib/xorg/Xorg                  8MiB |
|    1   N/A  N/A      1898      G   /usr/bin/gnome-shell                3MiB |
+-----------------------------------------------------------------------------+

Python: 3.8
Ray: 87c79553e94
OS: Ubuntu 20.04

Hmm, this doesn’t look right to me. The two trials should be placed on different GPUs. @sven1977, could you please confirm that there isn’t a configuration issue here?

You are using grid_search, so I suspect Tune is running both trials at once. Remove the grid_search and you should only see one training process.
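For example, this is just your config with the grid_search dropped in favour of a fixed learning rate, which should produce a single trial (and a single PPO process on the GPU):

    config = {
        "env": SimpleCorridor,
        "env_config": {"corridor_length": 5},
        "num_gpus": 1,
        "lr": 1e-4,  # fixed value instead of grid_search -> only one trial
        "num_workers": 2,
        "num_envs_per_worker": 1,
        "framework": "torch",
    }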

Hi @vakker00,

Does it run on separate GPUs if you run it like this?

results = tune.run('PPO', config=config, stop=stop, verbose=1, resources_per_trial={"gpu": 1})

Also, instead of grid_searching the lr like that, if you want two independent training runs, you can pass in num_samples=2.
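Roughly like this (a sketch assuming the same config and stop dicts as in your script, just with a fixed lr and the duplication expressed through num_samples):

    config["lr"] = 1e-4  # fixed value; the two trials come from num_samples instead
    results = tune.run(
        "PPO",
        config=config,
        stop=stop,
        num_samples=2,  # run two independent trials of the same config
        verbose=1,
    )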

I installed Ray from the latest build and I can no longer reproduce the issue. I’m not sure what caused it.
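For anyone hitting this later: a quick way to sanity-check how Ray hands out GPUs on a machine like this is to launch two tasks that each request one GPU and print what they see. This is just a generic Ray check, not part of the original reproduction:

import os
import ray

ray.init()

@ray.remote(num_gpus=1)
def report_gpu():
    # Ray sets CUDA_VISIBLE_DEVICES per task/actor based on the GPUs it assigned
    return ray.get_gpu_ids(), os.environ.get("CUDA_VISIBLE_DEVICES")

# On a 2-GPU machine the two tasks should report different GPU ids
print(ray.get([report_gpu.remote() for _ in range(2)]))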