Independent learning for more agents [PettingZoo waterworld_v4]

Hi,

I am new to this library and would like to train multiple independent agents in the same environment.

The waterworld example here is close to what I want to achieve, but it only works for two agents (pursuers). How can I adjust it to work with more agents, for example 5?
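
For context, here is a minimal sketch of the independent-learning setup I have in mind: one policy per agent, with each agent mapped to the policy of the same name. It is plain Python with no RLlib import. The agent naming scheme (`pursuer_0`, `pursuer_1`, ...) is my assumption about waterworld_v4, and `NUM_AGENTS` and `unmapped_agents` are hypothetical names used only for illustration.

```python
# Sketch only: pure Python, so the naming logic can be checked in isolation.
NUM_AGENTS = 5  # assumed to match both n_pursuers and args.num_agents

# Assumption: waterworld_v4 names its agents "pursuer_0", "pursuer_1", ...
policy_ids = [f"pursuer_{i}" for i in range(NUM_AGENTS)]


def policy_mapping_fn(agent_id, *args, **kwargs):
    """Independent learning: each agent uses the policy that shares its ID."""
    return agent_id


def unmapped_agents(env_agent_ids, known_policy_ids):
    """Return agents whose mapped policy is not in the policy set.

    Hypothetical helper: a non-empty result would mean the number of
    policies and the number of environment agents are out of sync.
    """
    known = set(known_policy_ids)
    return [a for a in env_agent_ids if policy_mapping_fn(a) not in known]
```

Calling `unmapped_agents(env.possible_agents, policy_ids)` on the wrapped PettingZoo environment should return an empty list if both counts were raised consistently.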

This is the error that I get when I increase the number of agents to 5, i.e. set n_pursuers = 5 and args.num_agents = 5:

(MultiAgentEnvRunner pid=140923) 2025-01-02 18:07:36,256        ERROR actor_manager.py:187 -- Worker exception caught during `apply()`: all input arrays must have the same shape
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/utils/actor_manager.py", line 183, in apply
(MultiAgentEnvRunner pid=140923)     return func(self, *args, **kwargs)
(MultiAgentEnvRunner pid=140923)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/execution/rollout_ops.py", line 110, in <lambda>
(MultiAgentEnvRunner pid=140923)     else (lambda w: (w.sample(**random_action_kwargs), w.get_metrics()))
(MultiAgentEnvRunner pid=140923)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/util/tracing/tracing_helper.py", line 467, in _resume_span
(MultiAgentEnvRunner pid=140923)     return method(self, *_args, **_kwargs)
(MultiAgentEnvRunner pid=140923)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 180, in sample
(MultiAgentEnvRunner pid=140923)     samples = self._sample_timesteps(
(MultiAgentEnvRunner pid=140923)               ^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/util/tracing/tracing_helper.py", line 467, in _resume_span
(MultiAgentEnvRunner pid=140923)     return method(self, *_args, **_kwargs)
(MultiAgentEnvRunner pid=140923)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/env/multi_agent_env_runner.py", line 380, in _sample_timesteps
(MultiAgentEnvRunner pid=140923)     self._episode.finalize(drop_zero_len_single_agent_episodes=True)
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/env/multi_agent_episode.py", line 794, in finalize
(MultiAgentEnvRunner pid=140923)     agent_eps.finalize()
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/env/single_agent_episode.py", line 578, in finalize
(MultiAgentEnvRunner pid=140923)     self.actions.finalize()
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/env/utils/infinite_lookback_buffer.py", line 161, in finalize
(MultiAgentEnvRunner pid=140923)     self.data = batch(self.data)
(MultiAgentEnvRunner pid=140923)                 ^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/utils/spaces/space_utils.py", line 373, in batch
(MultiAgentEnvRunner pid=140923)     ret = tree.map_structure(lambda *s: np_func(s, axis=0), *list_of_structs)
(MultiAgentEnvRunner pid=140923)           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/tree/__init__.py", line 435, in map_structure
(MultiAgentEnvRunner pid=140923)     [func(*args) for args in zip(*map(flatten, structures))])
(MultiAgentEnvRunner pid=140923)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/tree/__init__.py", line 435, in <listcomp>
(MultiAgentEnvRunner pid=140923)     [func(*args) for args in zip(*map(flatten, structures))])
(MultiAgentEnvRunner pid=140923)      ^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/ray/rllib/utils/spaces/space_utils.py", line 373, in <lambda>
(MultiAgentEnvRunner pid=140923)     ret = tree.map_structure(lambda *s: np_func(s, axis=0), *list_of_structs)
(MultiAgentEnvRunner pid=140923)                                         ^^^^^^^^^^^^^^^^^^
(MultiAgentEnvRunner pid=140923)   File "/home/pvalia01/miniconda3/envs/pettingzooenv2/lib/python3.11/site-packages/numpy/_core/shape_base.py", line 460, in stack
(MultiAgentEnvRunner pid=140923)     raise ValueError('all input arrays must have the same shape')
(MultiAgentEnvRunner pid=140923) ValueError: all input arrays must have the same shape
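
The last frames show `np.stack` failing while the episode's action buffer is batched, so some stored actions apparently differ in shape. Below is a small sketch that reproduces the same `ValueError` outside RLlib and reports which entry is the odd one out. The data is made up (three 2-element actions, one with an extra leading dimension) and the helper names are hypothetical. It does not identify the actual cause inside RLlib.

```python
from collections import Counter

import numpy as np


def shape_report(items):
    """Count how often each shape occurs in a list of per-timestep values."""
    return Counter(np.shape(x) for x in items)


def first_mismatch(items):
    """Return (index, shape) of the first item whose shape differs from item 0."""
    if not items:
        return None
    expected = np.shape(items[0])
    for i, x in enumerate(items):
        if np.shape(x) != expected:
            return i, np.shape(x)
    return None


# Made-up data standing in for one agent's buffered actions.
actions = [
    np.zeros(2, dtype=np.float32),
    np.zeros(2, dtype=np.float32),
    np.zeros((1, 2), dtype=np.float32),  # ragged entry with an extra batch axis
]

# Stacking ragged entries raises the same error as in the traceback.
try:
    np.stack(actions, axis=0)
    stack_error = None
except ValueError as exc:
    stack_error = str(exc)

print(stack_error)
print(shape_report(actions))
print(first_mismatch(actions))
```

Running a check like `first_mismatch` on the actions collected for each agent before they are batched could show whether a specific agent or timestep produces the differently shaped action.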

Thank you.