How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hi all.
Second post, still learning. Thanks for your patience.
Same use case as in my first post on a similar topic here: Repeated in action space
Having implemented an ugly workaround for the variable number of actions in the action_space, I’m now facing a similar issue with the observation space, which looks like this:
# Repeated is imported from ray.rllib.utils.spaces.repeated
self.observation_space = Dict({
    agent_id: Dict({
        "obs_1": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
        "obs_2": Box(low=-18, high=+18, shape=(1,), dtype=np.int8),
        # Variable-length execution history, up to 300 entries
        "execution_log": Repeated(Dict({
            "obs_3": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
            "obs_4": Box(low=-18, high=+18, shape=(1,), dtype=np.int8),
        }), max_len=300),
    })
    for agent_id in self.agents
})
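For reference, a single agent’s observation as returned by my step() looks roughly like this (illustrative values; execution_log is a plain Python list whose length varies from step to step):

import numpy as np

agent_obs = {
    "obs_1": np.array([-66.5, 0.03], dtype=np.float16),
    "obs_2": np.array([6], dtype=np.int8),
    "execution_log": [  # variable length, up to max_len=300 entries
        {"obs_3": np.array([-66.5, 0.029], dtype=np.float16),
         "obs_4": np.array([6], dtype=np.int8)},
        {"obs_3": np.array([-67.6, -0.64], dtype=np.float16),
         "obs_4": np.array([6], dtype=np.int8)},
        # ...
    ],
}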
I’m now seeing this error during training:
More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
Full stacktrace
(TunerInternal pid=55518) Trial task failed for trial APPO_leader_env-v0_cf93c_00000
(TunerInternal pid=55518) Traceback (most recent call last):
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
(TunerInternal pid=55518) result = ray.get(future)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
(TunerInternal pid=55518) return fn(*args, **kwargs)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
(TunerInternal pid=55518) return func(*args, **kwargs)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/worker.py", line 2493, in get
(TunerInternal pid=55518) raise value.as_instanceof_cause()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::APPO.train() (pid=55736, ip=192.168.111.128, actor_id=a1691998a5637356538ae8da01000000, repr=APPO)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 375, in train
(TunerInternal pid=55518) raise skipped from exception_cause(skipped)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 372, in train
(TunerInternal pid=55518) result = self.step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 851, in step
(TunerInternal pid=55518) results, train_iter_ctx = self._run_one_training_iteration()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 2835, in _run_one_training_iteration
(TunerInternal pid=55518) results = self.training_step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/appo/appo.py", line 372, in training_step
(TunerInternal pid=55518) train_results = super().training_step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/impala.py", line 697, in training_step
(TunerInternal pid=55518) unprocessed_sample_batches = self.get_samples_from_workers(
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/impala.py", line 907, in get_samples_from_workers
(TunerInternal pid=55518) ] = self.workers.fetch_ready_async_reqs(
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 788, in fetch_ready_async_reqs
(TunerInternal pid=55518) handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 76, in handle_remote_call_result_errors
(TunerInternal pid=55518) raise r.get()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.apply() (pid=55821, ip=192.168.111.128, actor_id=c94a6f52b9feffe6106943ba01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f703ea1d6c0>)
(TunerInternal pid=55518) ValueError: The two structures don't have the same nested structure.
(TunerInternal pid=55518)
(TunerInternal pid=55518) First structure: type=dict str={ ... }
(TunerInternal pid=55518)
(TunerInternal pid=55518) Second structure: type=OrderedDict str=OrderedDict([( ... )])
(TunerInternal pid=55518)
(TunerInternal pid=55518) More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
First structure sample
First structure: type=list str=[
{'obs_3': array([-6.650e+01, 2.913e-02], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-67.6 , -0.6377], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-68.8 , -1.305], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-69.94 , -1.971], dtype=float16), 'obs_4': array([6], dtype=int8)},
...
]
Second structure sample
Second structure: type=list str=[
OrderedDict([('obs_3', array([ -968.5, -1599. ], dtype=float16)), ('obs_4', array([16], dtype=int8))]),
OrderedDict([('obs_3', array([-1355., -650.], dtype=float16)), ('obs_4', array([9], dtype=int8))]),
OrderedDict([('obs_3', array([ -440.5, -1223. ], dtype=float16)), ('obs_4', array([3], dtype=int8))]),
OrderedDict([('obs_3', array([ -375.5, -2041. ], dtype=float16)), ('obs_4', array([-11], dtype=int8))]),
...
]
The two structures are identical; the only difference is the number of elements in the Repeated space “execution_log” in the observation: the first has 124 entries, the second has 136. This is expected: the count changes depending on the actions taken.
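As far as I can tell, the check that fails is the nested-structure assertion from dm-tree, which treats two Python lists of different lengths as different structures. A tiny standalone repro, no env needed (the entry values are dummies):

import numpy as np
import tree  # dm-tree

def log_entry():
    return {"obs_3": np.zeros(2, dtype=np.float16),
            "obs_4": np.zeros(1, dtype=np.int8)}

first = [log_entry() for _ in range(124)]   # e.g. 124 entries in one observation
second = [log_entry() for _ in range(136)]  # e.g. 136 entries in the next

# Raises a ValueError complaining that the two structures don't have the
# same number of elements, just like the traceback above.
tree.assert_same_structure(first, second)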
It seems to be the same error as reported in this issue on GitHub:
[RLlib] Repeated space: The two structures don’t have the same number of elements
I’m really confused by the response when the ticket was closed:
the reason for this issue is that in your script the example_obs does not match the structure specified by the space of your environment. To test that this is really the case you can use env.observation_space.sample() or env.reset() to get a proper obs struct and then your script would work.
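Here is roughly how I read that suggestion (a sketch; LeaderEnv is a placeholder for my actual env class, and I’m assuming the Gymnasium-style reset() that returns obs and info):

env = LeaderEnv()          # placeholder for my actual env class
obs, _info = env.reset()   # Gymnasium-style reset: (obs_dict, info_dict)

# The suggested test: does each agent's real observation satisfy the declared space?
for agent_id, agent_obs in obs.items():
    assert env.observation_space[agent_id].contains(agent_obs), agent_id

# A space-generated sample for comparison; its execution_log lists get random
# lengths, so the element counts won't necessarily match the real observation.
space_sample = env.observation_space.sample()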
The structures do match; the only difference I see is the number of elements. My understanding of the Repeated space is that variable-length lists are precisely the use case it was made for. Now it’s failing because the observations in the output have different lengths? Does that make sense?
I can’t find any real documentation, but the pydoc comment (“Represents a variable-length list of child spaces.”) and the name of the constructor parameter max_len would seem to confirm my understanding of the semantics of the Repeated space.
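Just to sanity-check that reading, a standalone snippet (assuming the Ray 2.x import path for Repeated and Gymnasium spaces):

import numpy as np
from gymnasium.spaces import Box, Dict
from ray.rllib.utils.spaces.repeated import Repeated

entry = Dict({
    "obs_3": Box(low=-200, high=200, shape=(2,), dtype=np.float16),
    "obs_4": Box(low=-18, high=18, shape=(1,), dtype=np.int8),
})
log_space = Repeated(entry, max_len=300)

a = log_space.sample()
b = log_space.sample()
print(len(a), len(b))                                # lengths usually differ between samples
print(log_space.contains(a), log_space.contains(b))  # yet both are valid observations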
I am now going to switch to a fixed-length Tuple space and pad the unused elements with zero values. I don’t know what that will do to the learning process.
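For completeness, this is roughly the fallback I have in mind (a sketch; MAX_LOG_LEN and pad_log are my own placeholder names):

import numpy as np
from gymnasium.spaces import Box, Dict, Tuple

MAX_LOG_LEN = 300  # same cap as max_len in the Repeated space above

entry = Dict({
    "obs_3": Box(low=-200, high=200, shape=(2,), dtype=np.float16),
    "obs_4": Box(low=-18, high=18, shape=(1,), dtype=np.int8),
})

# Fixed-length alternative: always MAX_LOG_LEN entries, unused slots zero-padded.
execution_log_space = Tuple([entry] * MAX_LOG_LEN)

def pad_log(log):
    # Pad a variable-length list of log entries out to MAX_LOG_LEN with zeros.
    zero_entry = {
        "obs_3": np.zeros(2, dtype=np.float16),
        "obs_4": np.zeros(1, dtype=np.int8),
    }
    return tuple(log) + (zero_entry,) * (MAX_LOG_LEN - len(log))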
Thanks for any input you can give.
Best Regards,
Adam