I'm trying to run the QMIX code with my own environment, but I'm having problems defining the observation and action spaces. I get a structure mismatch error:
First structure: type=ndarray str=[ 0 0 0 -1 -1 0 0 0 0 0]
Second structure: type=tuple str=(array([53, 37, 5, 7, 6, 53, 58, 52, 59, 52], dtype=int32),)
I define the spaces as follows, similar to the two-step game example.
In my environment class I have this:
obs_space = [self.max_episode_length*2 + 1]*obs_shape
self.observation_space = gym.spaces.MultiDiscrete(obs_space)
t = [*range(self.num_tech + 1)]
c = [*range(self.num_com + 1)]
self.list_actions = list(itertools.product(t, c))
self.action_space = gym.spaces.Discrete(len(self.list_actions))
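As a minimal sketch of what these two spaces encode, the plain-Python snippet below rebuilds the action list and checks an observation against the MultiDiscrete bounds. The sizes (num_tech, num_com, max_episode_length, obs_shape) are hypothetical values chosen for illustration, and decode_action and in_multidiscrete are hypothetical helpers, not part of gym. It assumes that each MultiDiscrete component only accepts values from 0 to n - 1, which would make the -1 entries in the observation from the error message fall outside the space.

```python
import itertools

# Hypothetical sizes, chosen only for illustration.
num_tech = 2
num_com = 3
max_episode_length = 30
obs_shape = 10

# Same construction as in the environment class: every (tech, com) pair.
t = [*range(num_tech + 1)]
c = [*range(num_com + 1)]
list_actions = list(itertools.product(t, c))
n_actions = len(list_actions)  # the size passed to Discrete(...)


def decode_action(index):
    """Map a Discrete action index back to its (tech, com) pair."""
    return list_actions[index]


# Per-component sizes, as passed to MultiDiscrete(...).
nvec = [max_episode_length * 2 + 1] * obs_shape


def in_multidiscrete(obs, nvec):
    """True if obs has the right length and every entry lies in [0, n - 1]."""
    return len(obs) == len(nvec) and all(0 <= v < n for v, n in zip(obs, nvec))


print(n_actions)
print(decode_action(5))
print(in_multidiscrete([0, 0, 0, -1, -1, 0, 0, 0, 0, 0], nvec))
```

If the -1 values are intended, one option might be to shift the observation by a constant offset so that all entries are non-negative, but that depends on what the entries mean in the environment.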
So the observation space is MultiDiscrete and the action space is Discrete. In my trainer class I then define the env creator function and register my environment with
register_env(env_name, env_creator)
where env_creator is defined like this:
def env_creator(_):
    grouping = {
        "group_1": [str(m.id) for m in machines]
    }
    F = FEnv(m)
    obs_space = Tuple([F.observation_space])
    act_space = Tuple([F.action_space])
    return F.with_agent_groups(grouping, obs_space, act_space)
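To make the shapes concrete: a Tuple space wrapping a single MultiDiscrete describes a one-element tuple holding an array, while the error message reports a bare array as the first structure. The sketch below imitates the shorthand that the tree package prints ("." for a leaf, "(.,)" for a one-element tuple). structure_of is a hypothetical helper written for this illustration, not the package's implementation, and stdlib arrays stand in for NumPy arrays.

```python
from array import array


def structure_of(value):
    """Return a shorthand for the nesting: '.' for a leaf, '(...)' for a sequence."""
    if isinstance(value, (tuple, list)):
        inner = ", ".join(structure_of(v) for v in value)
        if len(value) == 1:
            inner += ","  # mirror Python's one-element tuple notation
        return "(" + inner + ")"
    return "."


# array("i", ...) is a stdlib stand-in for a NumPy int32 array.
raw_obs = array("i", [0, 0, 0, -1, -1, 0, 0, 0, 0, 0])
space_sample = (array("i", [53, 37, 5, 7, 6, 53, 58, 52, 59, 52]),)

print(structure_of(raw_obs))       # what the sampler received
print(structure_of(space_sample))  # what the Tuple space describes
print(structure_of((raw_obs,)))    # the same observation wrapped in a tuple
```

Under this reading, the observation reaching the preprocessor was never wrapped into a per-group tuple. One thing that might be worth checking is whether the agent IDs returned by reset() and step() are exactly the strings listed in the grouping, and whether the Tuple spaces have one entry per agent in the group.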
When I run it I get the following error:
(MARL) 0 [mruiz@iris-075 MARLPdM2](2603757 1N/T/1CN)$ python trainqmix.py -c config_13 -g 0 -w testqmix -t 60 -e 0 -x .001
/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/tune/callback.py:217: FutureWarning: Please update `setup` method in callback `<class 'ray.tune.integration.wandb.WandbLoggerCallback'>` to match the method signature in `ray.tune.callback.Callback`.
warnings.warn(
wandb: Currently logged in as: marceloruiz (use `wandb login --relogin` to force relogin)
wandb: WARNING Tried to auto resume run with id ea4df_00000 but id c732e_00000 is set.
wandb: Tracking run with wandb version 0.12.9
wandb: Syncing run QMIX_FactoryEnv_c732e_00000
wandb: View project at https://wandb.ai/marceloruiz/QMIX
wandb: View run at https://wandb.ai/marceloruiz/QMIX/runs/c732e_00000
wandb: Run data is saved locally in /mnt/irisgpfs/users/mruiz/MARLPdM2/wandb/run-20220130_192354-c732e_00000
wandb: Run `wandb offline` to turn off syncing.
(QMIX pid=477) 2022-01-30 19:24:03,729 INFO dqn.py:141 -- In multi-agent mode, policies will be optimized sequentially by the multi-GPU optimizer. Consider setting simple_optimizer=True if this doesn't work for you.
(QMIX pid=477) 2022-01-30 19:24:03,729 INFO trainer.py:743 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
== Status ==
Current time: 2022-01-30 19:24:09 (running for 00:00:16.53)
Memory usage on this node: 29.2/125.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/28 CPUs, 0/0 GPUs, 0.0/63.5 GiB heap, 0.0/31.21 GiB objects
Result logdir: /mnt/irisgpfs/users/mruiz/MARLPdM2/results/my_exp
Number of trials: 1/1 (1 RUNNING)
+-----------------------------+----------+-----------------+
| Trial name | status | loc |
|-----------------------------+----------+-----------------|
| QMIX_FactoryEnv_c732e_00000 | RUNNING | 172.17.6.75:477 |
+-----------------------------+----------+-----------------+
(QMIX pid=477) 2022-01-30 19:24:09,826 WARNING util.py:57 -- Install gputil for GPU system monitoring.
(RolloutWorker pid=475) /home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/agents/qmix/qmix_policy.py:454: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:189.)
(RolloutWorker pid=475) k: torch.as_tensor(v, device=self.device)
2022-01-30 19:24:09,931 ERROR trial_runner.py:958 -- Trial QMIX_FactoryEnv_c732e_00000: Error processing event.
Traceback (most recent call last):
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 924, in _process_trial
results = self.trial_executor.fetch_result(trial)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 787, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/worker.py", line 1713, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::QMIX.train_buffered() (pid=477, ip=172.17.6.75, repr=QMIX)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/tune/trainable.py", line 255, in train_buffered
result = self.train()
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/tune/trainable.py", line 314, in train
result = self.step()
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 880, in step
raise e
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 867, in step
result = self.step_attempt()
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 920, in step_attempt
step_results = next(self.train_exec_impl)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 756, in __next__
return next(self.built_iterator)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 1075, in build_union
item = next(it)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 756, in __next__
return next(self.built_iterator)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
[Previous line repeated 1 more time]
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 471, in base_iterator
yield ray.get(futures, timeout=timeout)
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.par_iter_next() (pid=475, ip=172.17.6.75, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x2ab384047be0>)
ValueError: The two structures don't have the same nested structure.
First structure: type=ndarray str=[ 0 0 0 -1 -1 0 0 0 0 0]
Second structure: type=tuple str=(array([53, 37, 5, 7, 6, 53, 58, 52, 59, 52], dtype=int32),)
More specifically: Substructure "type=tuple str=(array([53, 37, 5, 7, 6, 53, 58, 52, 59, 52], dtype=int32),)" is a sequence, while substructure "type=ndarray str=[ 0 0 0 -1 -1 0 0 0 0 0]" is not
During handling of the above exception, another exception occurred:
ray::RolloutWorker.par_iter_next() (pid=475, ip=172.17.6.75, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x2ab384047be0>)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/util/iter.py", line 1151, in par_iter_next
return next(self.local_it)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 381, in gen_rollouts
yield self.sample()
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 757, in sample
batches = [self.input_reader.next()]
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 103, in next
batches = [self.get_data()]
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 265, in get_data
item = next(self._env_runner)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 633, in _env_runner
_process_observations(
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 848, in _process_observations
prep_obs = preprocessor.transform(raw_obs)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/models/preprocessors.py", line 236, in transform
self.check_shape(observation)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/models/preprocessors.py", line 68, in check_shape
observation = convert_element_to_space_type(
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/ray/rllib/utils/spaces/space_utils.py", line 320, in convert_element_to_space_type
return tree.map_structure(
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/tree/__init__.py", line 507, in map_structure
assert_same_structure(structures[0], other, check_types=check_types)
File "/home/users/mruiz/.conda/envs/MARL/lib/python3.8/site-packages/tree/__init__.py", line 363, in assert_same_structure
raise type(e)("%s\n"
ValueError: The two structures don't have the same nested structure.
First structure: type=ndarray str=[ 0 0 0 -1 -1 0 0 0 0 0]
Second structure: type=tuple str=(array([53, 37, 5, 7, 6, 53, 58, 52, 59, 52], dtype=int32),)
More specifically: Substructure "type=tuple str=(array([53, 37, 5, 7, 6, 53, 58, 52, 59, 52], dtype=int32),)" is a sequence, while substructure "type=ndarray str=[ 0 0 0 -1 -1 0 0 0 0 0]" is not
Entire first structure:
.
Entire second structure:
(.,)
I cannot figure out what is causing this structure mismatch error.