[RLlib] "target_q_model" not using GPU for custom model

I have set `"num_gpus": 1`. `policy.model` is using the GPU, but `policy.target_q_model` is not on the GPU when it is called from the `build_q_losses` method in `simple_q_torch_policy.py`.
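
For reference, the setup looks roughly like this (a minimal sketch; only the trainer, env, and custom-model file come from the traceback below, everything else is illustrative):

```python
from ray.rllib.agents.dqn import SimpleQTrainer
from ray.rllib.models import ModelCatalog

from model import MyQModel                    # custom TorchModelV2 from model.py (class name illustrative)
from env import HierarchicalGraphColorEnv     # env from the traceback (module path illustrative)

ModelCatalog.register_custom_model("my_q_model", MyQModel)

config = {
    "framework": "torch",
    "num_gpus": 1,                             # policy.model ends up on the GPU ...
    "model": {"custom_model": "my_q_model"},   # ... but policy.target_q_model stays on the CPU
}

trainer = SimpleQTrainer(config=config, env=HierarchicalGraphColorEnv)
trainer.train()
```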

Traceback (most recent call last):
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/tune/function_runner.py", line 248, in run
  self._entrypoint()
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/tune/function_runner.py", line 316, in entrypoint
  self._status_reporter.get_checkpoint())
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/tune/function_runner.py", line 580, in _trainable_func
  output = fn()
File "experiment.py", line 33, in experiment
  train_agent = SimpleQTrainer(config=config, env=HierarchicalGraphColorEnv)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 123, in __init__
  Trainer.__init__(self, config, env, logger_creator)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 584, in __init__
  super().__init__(config, logger_creator)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/tune/trainable.py", line 103, in __init__
  self.setup(copy.deepcopy(self.config))
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 731, in setup
  self._init(self.config, self.env_creator)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 152, in _init
  num_workers=self.config["num_workers"])
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 819, in _make_workers
  logdir=self.logdir)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
  spaces=spaces,
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 431, in _make_worker
  spaces=spaces,
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 557, in __init__
  policy_dict, policy_config)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1342, in _build_policy_map
  policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/policy/policy_template.py", line 281, in __init__
  stats_fn=stats_fn,
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/policy/policy.py", line 691, in _initialize_loss_from_dummy_batch
  self._loss(self, self.model, self.dist_class, train_batch)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/dqn/simple_q_torch_policy.py", line 91, in build_q_losses
  is_training=True)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/agents/dqn/simple_q_tf_policy.py", line 185, in compute_q_values
  }, [], None)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/ray/rllib/models/modelv2.py", line 231, in __call__
  res = self.forward(restored, state or [], seq_lens)
File "/home/cs20mtech12003/ML-Register-Allocation/model/RegAlloc/ggnn_drl/rllib_split_model/src/model.py", line 138, in forward
  x = F.relu(self.fc1(input_dict["obs"]["state"]))
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
  result = self.forward(*input, **kwargs)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 93, in forward
  return F.linear(input, self.weight, self.bias)
File "/home/cs20mtech12003/anaconda3/envs/rllib_env/lib/python3.7/site-packages/torch/nn/functional.py", line 1690, in linear
  ret = torch.addmm(bias, input, weight.t())
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)

Hey @Siddharth_Jain, could you check the latest master version? We fixed this issue recently, a few PRs ago. :slight_smile:

We are also now running nightly multi-GPU learning tests for all major algos (incl. DQN/SimpleQ), for both tf and torch, to make sure everything runs fine on a 2-GPU machine.

We’ll add LSTM-based tests to these as well (for the RNN-supporting algos, like PPO) in the next 2 weeks.
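
If upgrading right away is not an option, one interim workaround (a minimal sketch, not the actual fix that went into master; `MyQModel` and the layer sizes are illustrative) is to have the custom model's `forward()` move its own weights onto the device of the incoming batch, so the not-yet-moved target net follows the GPU batch the first time it is called:

```python
import torch.nn as nn
import torch.nn.functional as F

from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class MyQModel(TorchModelV2, nn.Module):
    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)
        self.fc1 = nn.Linear(64, 128)          # illustrative sizes
        self.fc2 = nn.Linear(128, num_outputs)

    def forward(self, input_dict, state, seq_lens):
        obs = input_dict["obs"]["state"]
        # If this module's weights are not on the same device as the batch
        # (e.g. target_q_model was left on the CPU), move them over once.
        if next(self.parameters()).device != obs.device:
            self.to(obs.device)
        x = F.relu(self.fc1(obs))
        return self.fc2(x), state
```

For `policy.model` the `.to()` call is a no-op (it is already on the GPU), so only the un-moved target network gets migrated on its first forward pass.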