Hey everyone, it seems to me that the centralized critic example provided in RLlib does not support multiple GPUs. I copy-pasted the central_value_function
and got the following error, which says that the input to the central_value_function
is not on the same CUDA device as the weight tensors:
File "/data/USER/Projects/mate/nips_rllib/policy/ccppo_imrl_policy.py", line 141, in loss_with_central_critic_and_ImRL
policy._central_value_out = model.value_function()
File "/data/USER/Projects/mate/nips_rllib/policy/ccppo_imrl_policy.py", line 139, in <lambda>
train_batch[TEAM_ACTION])
File "/data/USER/Projects/mate/nips_rllib/policy/ccppo_imrl_policy.py", line 264, in central_value_function
return torch.reshape(self.central_vf(input_), [-1])
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/ray/rllib/models/torch/misc.py", line 160, in forward
return self._model(x)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "/home/USER/Miniconda3/envs/mate/lib/python3.7/site-packages/torch/nn/functional.py", line 1755, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: Expected tensor for 'out' to have the same device as tensor for argument #2 'mat1'; but device 0 does not equal 1 (while checking arguments for addmm)
The program runs without error if I set num_gpus
to 1, so I guess the central value network lives on cuda:0 while the sample batches are placed on other devices.
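As a sketch of the workaround I have in mind (names like central_value_function and central_vf follow my own code above, not anything RLlib provides): move the input batch onto whatever device the central value network's parameters actually live on before calling it.

```python
import torch
import torch.nn as nn

def central_value_function(central_vf, input_):
    # Hypothetical fix: look up the device of the value network's weights
    # and relocate the incoming batch there before the forward pass.
    vf_device = next(central_vf.parameters()).device
    return torch.reshape(central_vf(input_.to(vf_device)), [-1])

# CPU-only smoke test of the pattern (no multi-GPU setup needed here).
vf = nn.Linear(4, 1)
out = central_value_function(vf, torch.randn(3, 4))
```

This only masks the placement problem rather than fixing where RLlib puts the batch, but it at least keeps the forward pass from crashing.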
To confirm, I added a debug print inside torch.nn.functional.linear to log any device mismatch:

def linear(input: Tensor, weight: Tensor, bias: Optional[Tensor] = None) -> Tensor:
    ...
    if input.device != weight.device:
        print(input.device, weight.device)
    ...

>>> (pid=103090) cuda:1 cuda:0
>>> (pid=103090) cuda:2 cuda:0
>>> (pid=103090) cuda:3 cuda:0
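Instead of patching torch.nn.functional.linear, the same relocation can be done less invasively with a forward pre-hook, which PyTorch runs on a module's inputs before forward. This is only a sketch of that idea, not anything from the RLlib example:

```python
import torch
import torch.nn as nn

def move_inputs_to_module_device(module, inputs):
    # Hypothetical hook: relocate incoming tensors onto the device
    # of this module's own parameters, leaving non-tensors untouched.
    device = next(module.parameters()).device
    return tuple(t.to(device) if torch.is_tensor(t) else t for t in inputs)

# Attach the hook to every Linear layer of a stand-in value network.
vf = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
for m in vf.modules():
    if isinstance(m, nn.Linear):
        m.register_forward_pre_hook(move_inputs_to_module_device)

result = vf(torch.randn(2, 4))
```

The hook could also just print the two devices, which reproduces the debug output above without editing the installed torch sources.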