How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
Hi all.
Second post, still learning. Thanks for your patience.
Same use case as in my first post on a similar topic here: Repeated in action space
Having implemented an ugly work-around to get past the issue with the variable number of actions in the action_space, I’m now facing a similar issue with the observation space, which looks like this:
self.observation_space = Dict({
    agent_id: Dict({
        "obs_1": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
        "obs_2": Box(low=-18, high=+18, shape=(1,), dtype=np.int8),
        # Variable-length log: up to 300 entries per observation
        "execution_log": Repeated(Dict({
            "obs_3": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
            "obs_4": Box(low=-18, high=+18, shape=(1,), dtype=np.int8)
        }), max_len=300)
    })
    for agent_id in self.agents
})
I’m now seeing this error during training:
More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
Full stacktrace
(TunerInternal pid=55518) Trial task failed for trial APPO_leader_env-v0_cf93c_00000
(TunerInternal pid=55518) Traceback (most recent call last):
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/air/execution/_internal/", line 110, in resolve_future
(TunerInternal pid=55518) result = ray.get(future)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 24, in auto_init_wrapper
(TunerInternal pid=55518) return fn(*args, **kwargs)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 103, in wrapper
(TunerInternal pid=55518) return func(*args, **kwargs)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 2493, in get
(TunerInternal pid=55518) raise value.as_instanceof_cause()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::APPO.train() (pid=55736, ip=, actor_id=a1691998a5637356538ae8da01000000, repr=APPO)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/", line 375, in train
(TunerInternal pid=55518) raise skipped from exception_cause(skipped)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/", line 372, in train
(TunerInternal pid=55518) result = self.step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/", line 851, in step
(TunerInternal pid=55518) results, train_iter_ctx = self._run_one_training_iteration()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/", line 2835, in _run_one_training_iteration
(TunerInternal pid=55518) results = self.training_step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/appo/", line 372, in training_step
(TunerInternal pid=55518) train_results = super().training_step()
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/", line 697, in training_step
(TunerInternal pid=55518) unprocessed_sample_batches = self.get_samples_from_workers(
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/", line 907, in get_samples_from_workers
(TunerInternal pid=55518) ] = self.workers.fetch_ready_async_reqs(
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/", line 788, in fetch_ready_async_reqs
(TunerInternal pid=55518) handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
(TunerInternal pid=55518) File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/", line 76, in handle_remote_call_result_errors
(TunerInternal pid=55518) raise r.get()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.apply() (pid=55821, ip=, actor_id=c94a6f52b9feffe6106943ba01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f703ea1d6c0>)
(TunerInternal pid=55518) ValueError: The two structures don't have the same nested structure.
(TunerInternal pid=55518)
(TunerInternal pid=55518) First structure: type=dict str={ ... }
(TunerInternal pid=55518)
(TunerInternal pid=55518) Second structure: type=OrderedDict str=OrderedDict([( ... )])
(TunerInternal pid=55518)
(TunerInternal pid=55518) More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
First structure sample
First structure: type=list str=[
{'obs_3': array([-6.650e+01, 2.913e-02], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-67.6 , -0.6377], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-68.8 , -1.305], dtype=float16), 'obs_4': array([6], dtype=int8)},
{'obs_3': array([-69.94 , -1.971], dtype=float16), 'obs_4': array([6], dtype=int8)},
Second structure sample
Second structure: type=list str=[
OrderedDict([('obs_3', array([ -968.5, -1599. ], dtype=float16)), ('obs_4', array([16], dtype=int8))]),
OrderedDict([('obs_3', array([-1355., -650.], dtype=float16)), ('obs_4', array([9], dtype=int8))]),
OrderedDict([('obs_3', array([ -440.5, -1223. ], dtype=float16)), ('obs_4', array([3], dtype=int8))]),
OrderedDict([('obs_3', array([ -375.5, -2041. ], dtype=float16)), ('obs_4', array([-11], dtype=int8))]),
The two structures are identical; the only difference is the number of elements in the Repeated space “execution_log” in the observation. The first one has 124, the second has 136. This is normal: the count changes depending on the given action.
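To illustrate what I mean, here is a minimal sketch of how a strict nested-structure comparison treats two lists of different lengths. It does not call RLlib or whatever library raises the error; `same_structure` and `make_log` are hypothetical helpers I wrote only to mimic the kind of check the error message implies.

```python
from collections import OrderedDict
from collections.abc import Mapping

import numpy as np


def same_structure(a, b):
    """Hypothetical check: same keys and same list lengths at every level."""
    if isinstance(a, Mapping) and isinstance(b, Mapping):
        # dict and OrderedDict are both mappings, so they compare by keys only
        return set(a) == set(b) and all(same_structure(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        # Sequences only match if they have the same number of elements
        return len(a) == len(b) and all(same_structure(x, y) for x, y in zip(a, b))
    # Anything else (e.g. a NumPy array) is treated as a leaf
    containers = (Mapping, list, tuple)
    return not isinstance(a, containers) and not isinstance(b, containers)


def make_log(n):
    """Build a placeholder execution_log with n entries shaped like the post's."""
    return [
        {"obs_3": np.zeros(2, dtype=np.float16), "obs_4": np.zeros(1, dtype=np.int8)}
        for _ in range(n)
    ]


obs_a = {"execution_log": make_log(124)}
obs_b = OrderedDict(execution_log=make_log(136))

print(same_structure(obs_a, obs_b))                            # lengths differ
print(same_structure(obs_a, {"execution_log": make_log(124)}))  # lengths equal
```

Under a check like this, every entry has the same keys and dtypes, yet the comparison still fails purely because 124 != 136.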
It seems to be the same error as reported in this issue on GitHub:
[RLlib] Repeated space: The two structures don’t have the same number of elements
I’m really confused by the response when the ticket was closed:
the reason for this issue is that in your script the
does not match the structure specified by the space of your environment. To test that this is really the case you can use env.observation_space.sample() to get a proper obs struct and then your script would work.
The structures do match; the only difference that I see is the number of elements. My understanding of the Repeated space is that this is precisely the use case it was made for: variable-length lists. Now it’s failing because the observations in the output have different lengths? Does that make sense?
I can’t find any real documentation, but the pydoc comment “Represents a variable-length list of child spaces.” and the name of the constructor parameter max_len would seem to confirm my understanding of the semantics of the Repeated space.
I am now going to switch to a fixed-length Tuple space and pad the unused elements with zero values. I don’t know what that will do to the learning process.
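A rough sketch of the padding idea, for anyone following along. Note that this is not the Tuple layout itself: for brevity it stacks the entries into fixed-shape arrays and adds a validity mask, which is my own assumption about how padding could be made distinguishable from real zero readings. `pad_execution_log`, `MAX_LEN`, and the `mask` key are hypothetical names.

```python
import numpy as np

MAX_LEN = 300  # same bound as max_len in the Repeated space above


def pad_execution_log(log, max_len=MAX_LEN):
    """Pad a variable-length list of {"obs_3", "obs_4"} entries to a fixed length.

    Returns fixed-shape arrays plus a 0/1 mask marking which rows are real.
    """
    n = len(log)
    if n > max_len:
        raise ValueError(f"log has {n} entries, more than max_len={max_len}")

    obs_3 = np.zeros((max_len, 2), dtype=np.float16)
    obs_4 = np.zeros((max_len, 1), dtype=np.int8)
    mask = np.zeros((max_len,), dtype=np.int8)

    for i, entry in enumerate(log):
        obs_3[i] = entry["obs_3"]
        obs_4[i] = entry["obs_4"]
    mask[:n] = 1  # rows beyond n stay zero and are flagged as padding

    return {"obs_3": obs_3, "obs_4": obs_4, "mask": mask}


# Usage with three placeholder entries
log = [
    {"obs_3": np.array([-66.5, 0.03], dtype=np.float16), "obs_4": np.array([6], dtype=np.int8)},
    {"obs_3": np.array([-67.6, -0.64], dtype=np.float16), "obs_4": np.array([6], dtype=np.int8)},
    {"obs_3": np.array([-68.8, -1.3], dtype=np.float16), "obs_4": np.array([6], dtype=np.int8)},
]
padded = pad_execution_log(log)
print(padded["obs_3"].shape, int(padded["mask"].sum()))
```

Whether the mask actually helps learning would depend on the model making use of it; without something like it, a padded row is indistinguishable from a genuine all-zero entry.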
Thanks for any input you can give.
Best Regards,