I am trying to replicate the OpenSpiel example in RLlib: ray/rllib/env/wrappers/open_spiel.py at 5323e739b5a87f92059dd12ed7a04e32c4590509 · ray-project/ray · GitHub
The example returns an observation dictionary that contains only the next agent that must perform an action.
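To make the pattern concrete, here is a minimal sketch of what I mean. It is not the OpenSpiel wrapper itself: `TurnBasedEnv`, the agent names, and the integer observations are hypothetical, and the class is plain Python so it runs without Ray (in real code it would subclass `ray.rllib.env.multi_agent_env.MultiAgentEnv`).

```python
class TurnBasedEnv:
    """Hypothetical two-player, turn-based env (plain Python, no Ray needed).

    It follows the reset/step dict convention of a multi-agent env, but only
    the agent that must act next appears in the observation dict.
    """

    def __init__(self, max_turns=4):
        self.agents = ["player_0", "player_1"]
        self.max_turns = max_turns
        self.turn = 0

    def _current_agent(self):
        # Agents alternate strictly: player_0, player_1, player_0, ...
        return self.agents[self.turn % len(self.agents)]

    def reset(self, *, seed=None, options=None):
        self.turn = 0
        # Only the first agent to move receives an observation.
        return {self._current_agent(): self.turn}, {}

    def step(self, action_dict):
        acting = self._current_agent()
        # Exactly the agent that was observed last is expected to act.
        assert list(action_dict) == [acting], "unexpected acting agent"
        self.turn += 1
        done = self.turn >= self.max_turns
        rewards = {acting: 0.0}
        # Empty observation dict once the game is over (an assumption of
        # this sketch), otherwise only the next agent to move.
        obs = {} if done else {self._current_agent(): self.turn}
        terminateds = {"__all__": done}
        truncateds = {"__all__": False}
        return obs, rewards, terminateds, truncateds, {}


env = TurnBasedEnv(max_turns=2)
obs, _ = env.reset()
print(obs)  # observation for the first agent only
obs, rewards, terminateds, truncateds, _ = env.step({"player_0": 0})
print(obs)  # observation for the second agent only
```

So at every step the observation dict has a single key, and the action dict passed back to `step` is expected to contain that same key.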
I am able to run environments where all agents perform actions at the same time, but if I try to replicate the example in my own code (Ray 2.30), it fails with a validation error:
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 192, in apply
raise e
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 181, in apply
return func(self, *args, **kwargs)
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 104, in <lambda>
else (lambda w: (w.sample(**random_action_kwargs), w.get_metrics()))
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 155, in sample
samples = self._sample_timesteps(
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 312, in _sample_timesteps
self._episode.add_env_step(
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_episode.py", line 618, in add_env_step
sa_episode.add_env_step(
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/env/single_agent_episode.py", line 450, in add_env_step
self.validate()
File "/home/massimo/rlc-infrastructure/.venv/lib/python3.10/site-packages/ray/rllib/env/single_agent_episode.py", line 484, in validate
assert len(v) == len(self.observations) - 1
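As far as I can read the last frame, the assertion requires some per-agent list `v` to hold exactly one entry fewer than that agent's list of observations. Below is a small sketch that restates this condition on hand-made logs. `find_mismatched_agents` and the two log dicts are hypothetical names of mine, not RLlib API, and the sketch does not model how RLlib buffers actions internally.

```python
def find_mismatched_agents(obs_log, action_log):
    """Return agents whose action count is not len(observations) - 1.

    Both arguments map an agent id to the list of values logged for it.
    """
    agents = set(obs_log) | set(action_log)
    return sorted(
        agent
        for agent in agents
        if len(action_log.get(agent, [])) != len(obs_log.get(agent, [])) - 1
    )


# Hand-made logs: player_1 has acted on its only observation and has not
# received a follow-up observation yet, so the condition fails for it.
obs_log = {"player_0": [0, 2], "player_1": [1]}
action_log = {"player_0": [1], "player_1": [1]}
print(find_mismatched_agents(obs_log, action_log))
```

If this reading is right, the check can fail for an agent that has an action recorded but has not yet been given its next observation.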
Am I missing something? Is there something else I need to do besides returning a dictionary that contains only the relevant observations?