RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hi everyone,

I am stuck trying to debug a problem while training a PPO agent with a custom environment and would appreciate any help.

(PPOTrainer pid=5252) 2022-08-05 12:03:54,860	ERROR worker.py:451 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPOTrainer.__init__() (pid=5252, ip=129.132.4.157, repr=PPOTrainer)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/agents/trainer.py", line 1074, in _init
(PPOTrainer pid=5252)     raise NotImplementedError
(PPOTrainer pid=5252) NotImplementedError
(PPOTrainer pid=5252) 
(PPOTrainer pid=5252) During handling of the above exception, another exception occurred:
(PPOTrainer pid=5252) 
(PPOTrainer pid=5252) ray::PPOTrainer.__init__() (pid=5252, ip=129.132.4.157, repr=PPOTrainer)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/agents/trainer.py", line 870, in __init__
(PPOTrainer pid=5252)     super().__init__(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/tune/trainable.py", line 156, in __init__
(PPOTrainer pid=5252)     self.setup(copy.deepcopy(self.config))
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/agents/trainer.py", line 950, in setup
(PPOTrainer pid=5252)     self.workers = WorkerSet(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 170, in __init__
(PPOTrainer pid=5252)     self._local_worker = self._make_worker(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 630, in _make_worker
(PPOTrainer pid=5252)     worker = cls(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 630, in __init__
(PPOTrainer pid=5252)     self._build_policy_map(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1788, in _build_policy_map
(PPOTrainer pid=5252)     self.policy_map.create_policy(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/policy/policy_map.py", line 152, in create_policy
(PPOTrainer pid=5252)     self[policy_id] = class_(observation_space, action_space, merged_config)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/agents/ppo/ppo_torch_policy.py", line 59, in __init__
(PPOTrainer pid=5252)     self._initialize_loss_from_dummy_batch()
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/policy/policy.py", line 904, in _initialize_loss_from_dummy_batch
(PPOTrainer pid=5252)     actions, state_outs, extra_outs = self.compute_actions_from_input_dict(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/policy/torch_policy.py", line 335, in compute_actions_from_input_dict
(PPOTrainer pid=5252)     return self._compute_action_helper(
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
(PPOTrainer pid=5252)     return func(self, *a, **k)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/policy/torch_policy.py", line 997, in _compute_action_helper
(PPOTrainer pid=5252)     dist_inputs, state_out = self.model(input_dict, state_batches, seq_lens)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/models/modelv2.py", line 259, in __call__
(PPOTrainer pid=5252)     res = self.forward(restored, state or [], seq_lens)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/models/torch/complex_input_net.py", line 201, in forward
(PPOTrainer pid=5252)     nn_out, _ = self.flatten[i](
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/models/modelv2.py", line 259, in __call__
(PPOTrainer pid=5252)     res = self.forward(restored, state or [], seq_lens)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/models/torch/fcnet.py", line 146, in forward
(PPOTrainer pid=5252)     self._features = self._hidden_layers(self._last_flat_in)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
(PPOTrainer pid=5252)     return forward_call(*input, **kwargs)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
(PPOTrainer pid=5252)     input = module(input)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
(PPOTrainer pid=5252)     return forward_call(*input, **kwargs)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/ray/rllib/models/torch/misc.py", line 164, in forward
(PPOTrainer pid=5252)     return self._model(x)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
(PPOTrainer pid=5252)     return forward_call(*input, **kwargs)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
(PPOTrainer pid=5252)     input = module(input)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
(PPOTrainer pid=5252)     return forward_call(*input, **kwargs)
(PPOTrainer pid=5252)   File "/home/sem22h2/.conda/envs/RL/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
(PPOTrainer pid=5252)     return F.linear(input, self.weight, self.bias)
(PPOTrainer pid=5252) RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)
2022-08-05 12:03:54,875	ERROR trial_runner.py:886 -- Trial PPO_NasBench201_e466b_00000: Error processing event.

I start the training with the following code, where NasBench201 is my custom environment.

ray.init(ignore_reinit_error=True)
ray.tune.run(
    "PPO",
    stop={"training_iteration": 100},
    config={
        "env": NasBench201,
        "framework": "torch",
        "num_cpus_per_worker": 1,
        "log_level": "INFO",
        "horizon":1000,
        "num_gpus": 1,
        "num_workers": 1,
        "render_env": False
    },
    # local_dir="logs",
    callbacks=[WandbLoggerCallback(api_key="xxxxx", project="RayNasBenchV0")],
)

I also stumbled upon GitHub issue #21921 (ray-project/ray), which reports the same RuntimeError about tensors on cpu and cuda:0. However, that should already have been fixed. Furthermore, I am not using a multi-dimensional action/observation space.
Custom Env

no_ops = 4
num_triu = 6
self.observation_space = spaces.MultiBinary(no_ops * num_triu)
self.action_space = spaces.Discrete(no_ops * num_triu)

I am using Ray 1.13.0. Any input would be highly appreciated; I am not even sure how to start debugging this issue. So far, all I have tried is commenting out a few lines in my custom environment that might be causing it.
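
One further sanity check I could still try (purely to see whether the error is tied to GPU placement at all, not as a fix) would be rerunning the exact same call with the GPU disabled:

ray.init(ignore_reinit_error=True)
ray.tune.run(
    "PPO",
    stop={"training_iteration": 100},
    config={
        "env": NasBench201,
        "framework": "torch",
        "num_cpus_per_worker": 1,
        "horizon": 1000,
        "num_gpus": 0,  # CPU only, to check whether the device mismatch disappears
        "num_workers": 1,
        "render_env": False,
    },
)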

Best regards,
Dan

Hi @daniekie, I have tried something similar with the master branch and could not reproduce your issue. Can you use DEFAULT_CONFIG and modify it to match your config? I feel like this could be an issue in 1.13. Here is my code, which works on both master and 2.0.0rc0.


import gym
import gym.spaces as spaces

import ray
from ray import tune
from ray.rllib.algorithms.ppo import PPO, PPOConfig


class Env(gym.Env):
    # Minimal stand-in for the custom env, reusing the same spaces.
    def __init__(self, config) -> None:
        super().__init__()
        no_ops = 4
        num_triu = 6
        self.observation_space = spaces.MultiBinary(no_ops * num_triu)
        self.action_space = spaces.Discrete(no_ops * num_triu)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # Dummy transition: random observation, zero reward, never done.
        return self.observation_space.sample(), 0, False, {}


config = (
    PPOConfig()
    .framework("torch")
    .resources(num_gpus=1)
    .environment(env=Env)
)

tune.run(
    PPO,
    config=config.to_dict(),
)

Hi @kourosh ,

thank you for your reply!

I tried running your code with my custom environment, but wasn’t able to: I got the following error when importing PPOConfig.

ModuleNotFoundError: No module named 'ray.rllib.algorithms'

I installed Ray and RLlib with pip. I already tried reinstalling Ray, but ended up with the same error. Here are some details about my installation (pip show --verbose ray):

Name: ray
Version: 1.13.0
Summary: Ray provides a simple, universal API for building distributed applications.
Home-page: https://github.com/ray-project/ray
Author: Ray Team
Author-email: ray-dev@googlegroups.com
License: Apache 2.0
Location: /scratch2/sem22hs2/.conda/envs/RL/lib/python3.10/site-packages
Requires: aiosignal, attrs, click, filelock, frozenlist, grpcio, jsonschema, msgpack, numpy, protobuf, pyyaml, requests, virtualenv
Required-by:
Metadata-Version: 2.1
Installer: pip
Classifiers:
  Programming Language :: Python :: 3.6
  Programming Language :: Python :: 3.7
  Programming Language :: Python :: 3.8
  Programming Language :: Python :: 3.9
Entry-points:
  [console_scripts]
  ray = ray.scripts.scripts:main
  ray-operator = ray.ray_operator.operator:main
  rllib = ray.rllib.scripts:cli [rllib]
  serve = ray.serve.scripts:cli
  tune = ray.tune.scripts:cli

Project-URLs:

Do you have an idea what might be the problem? Should I try building from source?

Thanks for your help! :smile:

@daniekie the ray.rllib.algorithms module does not exist yet in version 1.13.0. Either use the ray.rllib.agents module that is available in your version, or install master via the nightly wheels.

If you decide to keep 1.13.0, you should import DEFAULT_CONFIG from ppo.py and fill in this dictionary in the more traditional way :slight_smile:
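
Something roughly like this is what I mean (an untested sketch; it just mirrors the config from your first post and assumes your NasBench201 env is importable):

from ray import tune
from ray.rllib.agents.ppo import DEFAULT_CONFIG, PPOTrainer

# Start from the default PPO config dict and override only what differs.
config = DEFAULT_CONFIG.copy()
config["env"] = NasBench201
config["framework"] = "torch"
config["num_gpus"] = 1
config["num_workers"] = 1
config["num_cpus_per_worker"] = 1
config["horizon"] = 1000

tune.run(
    PPOTrainer,
    config=config,
    stop={"training_iteration": 100},
)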

EDIT: I just read that @kourosh was referring to either master (3.0.0) or 2.0.0rc0, so you should definitely update to one of those two versions.

Thanks a lot guys, this solved the issue. Much appreciated! :smiley:

In the end, I installed master and imported DEFAULT_CONFIG from ray.rllib.algorithms.
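
For reference, the only things that changed relative to the 1.13-style snippet above are the imports and the trainer class (a sketch from memory, so details may differ slightly):

from ray import tune
from ray.rllib.algorithms.ppo import PPO, DEFAULT_CONFIG

# Same pattern as before: copy the default PPO config and override fields.
config = DEFAULT_CONFIG.copy()
config["env"] = NasBench201
config["framework"] = "torch"
config["num_gpus"] = 1
config["num_workers"] = 1

tune.run(PPO, config=config, stop={"training_iteration": 100})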
