How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I have been able to use the example that describes how to run a masked PPO agent (ray/rllib/examples/rl_module/action_masking_rlm.py at commit 9693fa855f9a9c2a738b2f26b294eb17282f43df in ray-project/ray), and I used ray/rllib/examples/multi_agent_and_self_play/self_play_league_based_with_open_spiel.py (at master) to create a multi-agent environment.
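For reference, my setup roughly follows the pattern below. This is a hedged sketch, not code from either example: the env class `MyTurnBasedEnv` (sketched further down) and the policy IDs are placeholders for my own code.

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    # My custom MultiAgentEnv (hypothetical name, sketched below).
    .environment(MyTurnBasedEnv)
    .multi_agent(
        # One policy per player, mapped by agent ID, similar to the
        # OpenSpiel self-play example.
        policies={"player_0", "player_1"},
        policy_mapping_fn=lambda agent_id, episode, **kwargs: agent_id,
    )
)
```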
I have encountered the following issue: I have not been able to replicate the turn-based nature of the OpenSpiel example. In that example, it is possible to return an observation dictionary that contains the key of the current player only. If I attempt to do the same, I get a crash when episodes are fetched from the MultiAgentEpisode data structure. It seems to me that this is because that code assumes every player takes actions simultaneously.
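A minimal sketch of the pattern I am using, mirroring the OpenSpiel example: at every step, the observation and reward dicts are keyed only by the player whose turn it is. All names, spaces, and episode logic here are illustrative, not my actual environment.

```python
import gymnasium as gym
from ray.rllib.env.multi_agent_env import MultiAgentEnv


class MyTurnBasedEnv(MultiAgentEnv):
    """Two players alternate turns; only the player to move is observed."""

    def __init__(self, config=None):
        super().__init__()
        self._agent_ids = {"player_0", "player_1"}
        self.observation_space = gym.spaces.Box(-1.0, 1.0, (4,))
        self.action_space = gym.spaces.Discrete(2)
        self._turn = 0
        self._t = 0

    def _current(self):
        return f"player_{self._turn}"

    def reset(self, *, seed=None, options=None):
        self._turn = 0
        self._t = 0
        # Observation dict keyed by the current player only.
        return {self._current(): self.observation_space.sample()}, {}

    def step(self, action_dict):
        # `action_dict` holds only the action of the player observed last step.
        self._t += 1
        self._turn = 1 - self._turn
        done = self._t >= 10
        # Again: obs/reward dicts contain only the player now to move.
        obs = {self._current(): self.observation_space.sample()}
        rewards = {self._current(): 0.0}
        return obs, rewards, {"__all__": done}, {"__all__": False}, {}
```

Returning observations this way produces the traceback below: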
File "/home/massimo/Documents/ray/example.py", line 481, in <module>
model.train()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 334, in train
raise skipped from exception_cause(skipped)
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 331, in train
result = self.step()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 849, in step
results, train_iter_ctx = self._run_one_training_iteration()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 3194, in _run_one_training_iteration
results = self.training_step()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 406, in training_step
return self._training_step_new_api_stack()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 418, in _training_step_new_api_stack
episodes = synchronous_parallel_sample(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 88, in synchronous_parallel_sample
sampled_data = worker_set.foreach_worker(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 771, in foreach_worker
handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 78, in handle_remote_call_result_errors
raise r.get()
ray.exceptions.RayTaskError(IndexError): ray::MultiAgentEnvRunner.apply() (pid=741129, ip=192.168.1.29, actor_id=3f1365f5963cafdf655d67ff01000000, repr=<ray.rllib.env.multi_agent_env_runner.MultiAgentEnvRunner object at 0x72ebc01e0850>)
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 189, in apply
raise e
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 178, in apply
return func(self, *args, **kwargs)
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 89, in <lambda>
lambda w: w.sample(), local_worker=False, healthy_only=True
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 137, in sample
return self._sample_timesteps(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 227, in _sample_timesteps
to_module = self._cached_to_module or self._env_to_module(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/connectors/env_to_module/env_to_module_pipeline.py", line 25, in __call__
return super().__call__(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/connectors/connector_pipeline_v2.py", line 68, in __call__
data = connector(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/connectors/common/add_observations_from_episodes_to_batch.py", line 111, in __call__
for sa_episode in self.single_agent_episode_iterator(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/connectors/connector_v2.py", line 278, in single_agent_episode_iterator
episode.get_agents_that_stepped()
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_episode.py", line 1423, in get_agents_that_stepped
return set(self.get_observations(-1).keys())
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_episode.py", line 935, in get_observations
return self._get(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_episode.py", line 1674, in _get
return self._get_data_by_env_steps(**kwargs)
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/multi_agent_episode.py", line 1880, in _get_data_by_env_steps
agent_indices = self.env_t_to_agent_t[agent_id].get(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/utils/infinite_lookback_buffer.py", line 158, in get
data = self._get_int_index(
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/utils/infinite_lookback_buffer.py", line 468, in _get_int_index
raise e
File "/home/massimo/Documents/ray/.venv/lib/python3.10/site-packages/ray/rllib/env/utils/infinite_lookback_buffer.py", line 455, in _get_int_index
data = data_to_use[idx]
IndexError: list index out of range
If I let all the actors execute actions on every step, and then simply ignore the actions of those whose turn it is not, the issue goes away. Is there a proper way to address this?
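Concretely, the workaround replaces `step()` in the sketch above with something like the following (again a sketch; in my real env the current player's action is applied to the actual game state):

```python
    def step(self, action_dict):
        current = self._current()
        # Every agent submits an action each step; only the current
        # player's action matters, the rest are thrown away.
        action = action_dict[current]  # applied to the game state in my env
        self._t += 1
        self._turn = 1 - self._turn
        done = self._t >= 10
        # Return entries for *all* agents so each env step looks
        # simultaneous to MultiAgentEpisode's bookkeeping.
        obs = {aid: self.observation_space.sample() for aid in self._agent_ids}
        rewards = {aid: 0.0 for aid in self._agent_ids}
        return obs, rewards, {"__all__": done}, {"__all__": False}, {}
```

This avoids the IndexError, but it samples and trains on actions that are never applied, which is why I am asking whether there is a supported pattern for turn-based envs on the new API stack.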