Repeated in observation space

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

Hi all.

Second post, still learning. Thanks for your patience.

Same use case as in my first post on a similar topic here: Repeated in action space

Having implemented an ugly work-around to get past the issue with the variable number of actions in the action_space, I’m now facing a similar issue with the observation space, which looks like this:

        self.observation_space = Dict({
            agent_id: Dict({
                "obs_1": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
                "obs_2": Box(low=-18, high=+18, shape=(1,), dtype=np.int8),
                # Variable-length list of log entries, capped at 300 per observation
                "execution_log": Repeated(Dict({
                    "obs_3": Box(low=-200, high=+200, shape=(2,), dtype=np.float16),
                    "obs_4": Box(low=-18, high=+18, shape=(1,), dtype=np.int8)
                }), max_len=300)
            })
            for agent_id in self.agents
        })

I’m now seeing this error during training:

More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
Full stacktrace
(TunerInternal pid=55518) Trial task failed for trial APPO_leader_env-v0_cf93c_00000
(TunerInternal pid=55518) Traceback (most recent call last):
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/air/execution/_internal/", line 110, in resolve_future
(TunerInternal pid=55518)     result = ray.get(future)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 24, in auto_init_wrapper
(TunerInternal pid=55518)     return fn(*args, **kwargs)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 103, in wrapper
(TunerInternal pid=55518)     return func(*args, **kwargs)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/_private/", line 2493, in get
(TunerInternal pid=55518)     raise value.as_instanceof_cause()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::APPO.train() (pid=55736, ip=, actor_id=a1691998a5637356538ae8da01000000, repr=APPO)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/", line 375, in train
(TunerInternal pid=55518)     raise skipped from exception_cause(skipped)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/tune/trainable/", line 372, in train
(TunerInternal pid=55518)     result = self.step()
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/", line 851, in step
(TunerInternal pid=55518)     results, train_iter_ctx = self._run_one_training_iteration()
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/", line 2835, in _run_one_training_iteration
(TunerInternal pid=55518)     results = self.training_step()
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/appo/", line 372, in training_step
(TunerInternal pid=55518)     train_results = super().training_step()
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/", line 697, in training_step
(TunerInternal pid=55518)     unprocessed_sample_batches = self.get_samples_from_workers(
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/algorithms/impala/", line 907, in get_samples_from_workers
(TunerInternal pid=55518)     ] = self.workers.fetch_ready_async_reqs(
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/", line 788, in fetch_ready_async_reqs
(TunerInternal pid=55518)     handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
(TunerInternal pid=55518)   File "/home/adamcc/leader/venv/lib/python3.10/site-packages/ray/rllib/evaluation/", line 76, in handle_remote_call_result_errors
(TunerInternal pid=55518)     raise r.get()
(TunerInternal pid=55518) ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.apply() (pid=55821, ip=, actor_id=c94a6f52b9feffe6106943ba01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f703ea1d6c0>)
(TunerInternal pid=55518) ValueError: The two structures don't have the same nested structure.
(TunerInternal pid=55518) 
(TunerInternal pid=55518) First structure: type=dict str={ ... }
(TunerInternal pid=55518) 
(TunerInternal pid=55518) Second structure: type=OrderedDict str=OrderedDict([( ... )])
(TunerInternal pid=55518) 
(TunerInternal pid=55518) More specifically: The two structures don't have the same number of elements. First structure: type=list str=[ ... ] Second structure: type=list str=[ ... ]
First structure sample
First structure: type=list str=[
	{'obs_3': array([-6.650e+01,  2.913e-02], dtype=float16), 'obs_4': array([6], dtype=int8)}, 
	{'obs_3': array([-67.6   ,  -0.6377], dtype=float16), 'obs_4': array([6], dtype=int8)}, 
	{'obs_3': array([-68.8  ,  -1.305], dtype=float16), 'obs_4': array([6], dtype=int8)}, 
	{'obs_3': array([-69.94 ,  -1.971], dtype=float16), 'obs_4': array([6], dtype=int8)}, 
Second structure sample
Second structure: type=list str=[
    OrderedDict([('obs_3', array([ -968.5, -1599. ], dtype=float16)), ('obs_4', array([16], dtype=int8))]), 
	OrderedDict([('obs_3', array([-1355.,  -650.], dtype=float16)), ('obs_4', array([9], dtype=int8))]), 
	OrderedDict([('obs_3', array([ -440.5, -1223. ], dtype=float16)), ('obs_4', array([3], dtype=int8))]), 
	OrderedDict([('obs_3', array([ -375.5, -2041. ], dtype=float16)), ('obs_4', array([-11], dtype=int8))]),

The two structures are identical; the only difference is the number of elements in the Repeated space “execution_log” in the observation. The first one has 124 elements, the second has 136. This is normal: the count changes depending on the given action.
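The claim that only the list length differs can be checked with a small sketch like the one below. It uses only NumPy and synthetic data; `entry_signature`, `compare_logs` and `make_log` are hypothetical helper names, not part of RLlib, and the entries are zero-filled stand-ins for the real observations.

```python
import numpy as np


def entry_signature(entry):
    """Describe one log entry as sorted (key, shape, dtype) triples."""
    return tuple(
        (key, np.asarray(value).shape, str(np.asarray(value).dtype))
        for key, value in sorted(entry.items())
    )


def compare_logs(log_a, log_b):
    """Compare the per-entry layout of two logs and report their lengths."""
    sigs_a = {entry_signature(e) for e in log_a}
    sigs_b = {entry_signature(e) for e in log_b}
    return {
        "same_entry_layout": sigs_a == sigs_b,
        "len_a": len(log_a),
        "len_b": len(log_b),
    }


def make_log(n):
    """Build a synthetic log of n zero-filled entries (illustration only)."""
    return [
        {"obs_3": np.zeros(2, dtype=np.float16), "obs_4": np.zeros(1, dtype=np.int8)}
        for _ in range(n)
    ]


# Lengths taken from the post: 124 and 136 entries.
report = compare_logs(make_log(124), make_log(136))
print(report)
```

Because the helper only reads `entry.items()`, it treats plain `dict` and `OrderedDict` entries the same way, which matches the two samples shown above.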

It seems to be the same error as reported in this issue on GitHub:
[RLlib] Repeated space: The two structures don’t have the same number of elements

I’m really confused by the response when the ticket was closed:

the reason for this issue is that in your script the example_obs does not match the structure specified by the space of your environment. To test that this is really the case you can use env.observation_space.sample() or env.reset() to get a proper obs struct and then your script would work.

The structures do match, the only difference that I see is the number of elements. My understanding of the Repeated space is that this is precisely the use-case that it was made for: variable length lists. Now it’s failing because the observations in the output are different lengths? Does that make sense?

I can’t find any real documentation, but the pydoc comment “Represents a variable-length list of child spaces.” and the name of the constructor parameter max_len would seem to confirm my understanding of the semantics of the Repeated space.

I am now going to switch to a fixed-length Tuple space, and pad the unnecessary elements out with zero values. I don’t know what that is going to do to the learning process.
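A minimal sketch of the padding idea, assuming NumPy only. It stacks the entries into fixed-size arrays instead of building a Tuple, and it adds a 0/1 `mask` so the padded rows can be told apart from real ones; the mask and the helper name `pad_execution_log` are assumptions for illustration, not something from RLlib or from the environment above.

```python
import numpy as np

MAX_LEN = 300  # same cap as max_len in the Repeated space


def pad_execution_log(log, max_len=MAX_LEN):
    """Pad a variable-length list of {"obs_3", "obs_4"} entries with zeros.

    Returns fixed-size arrays plus a 0/1 mask that marks the real entries.
    """
    n = len(log)
    if n > max_len:
        raise ValueError(f"log has {n} entries, more than max_len={max_len}")
    obs_3 = np.zeros((max_len, 2), dtype=np.float16)
    obs_4 = np.zeros((max_len, 1), dtype=np.int8)
    mask = np.zeros((max_len,), dtype=np.int8)
    for i, entry in enumerate(log):
        obs_3[i] = entry["obs_3"]
        obs_4[i] = entry["obs_4"]
    mask[:n] = 1  # rows beyond n stay zero and are flagged as padding
    return {"obs_3": obs_3, "obs_4": obs_4, "mask": mask}


# Usage with a single entry shaped like the ones in the first sample.
log = [{"obs_3": np.array([-66.5, 0.029], dtype=np.float16),
        "obs_4": np.array([6], dtype=np.int8)}]
padded = pad_execution_log(log)
print(padded["obs_3"].shape, int(padded["mask"].sum()))
```

Every observation then has the same shape regardless of how many log entries exist. A matching fixed-size space would need Box entries with shapes `(MAX_LEN, 2)`, `(MAX_LEN, 1)` and `(MAX_LEN,)`; whether the model actually makes use of the mask depends on the network, so that part is untested here.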

Thanks for any input you can give.

Best Regards,