Does the trajectory view API support multi-agent environments? I’m currently hitting the following internal error from trainer.train():
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 327, in gen_rollouts
    yield self.sample()
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 661, in sample
    batches = [self.input_reader.next()]
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 94, in next
    batches = [self.get_data()]
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 223, in get_data
    item = next(self.rollout_provider)
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 669, in _env_runner
    _process_policy_eval_results(
  File "/home/hex/anaconda3/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 1446, in _process_policy_eval_results
    env_id: int = eval_data[i].env_id
IndexError: list index out of range
As you can see below, I have one environment with multiple agents. The for loop in _process_policy_eval_results seems to assume one action per environment: actions has two entries here, but eval_data has only one, so eval_data[i] runs out of range.
(Pdb) p actions
[{<class 'forge.blade.io.action.static.Attack'>: {<class 'forge.blade.io.action.static.Style'>: 2, <class 'forge.blade.io.action.static.Target'>: 57}, <class 'forge.blade.io.action.static.Move'>: {<class 'forge.blade.io.action.static.Direction'>: 1}}, {<class 'forge.blade.io.action.static.Attack'>: {<class 'forge.blade.io.action.static.Style'>: 0, <class 'forge.blade.io.action.static.Target'>: 46}, <class 'forge.blade.io.action.static.Move'>: {<class 'forge.blade.io.action.static.Direction'>: 1}}]
(Pdb) p eval_data
[PolicyEvalData(env_id=0, agent_id=198, obs=array([ 0., 0., 10., ..., 2., 24., 137.], dtype=float32), info={}, rnn_state=[array([[0., 0.]], dtype=float32), array([[0., 0.]], dtype=float32)], prev_action=None, prev_reward=0.0)]
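To make the suspected failure mode concrete, here is a minimal sketch that reproduces the same IndexError outside of RLlib. It assumes the loop walks over actions and indexes eval_data by position. The PolicyEvalData stand-in, the pair_actions_with_eval_data helper, and the simplified action dicts are hypothetical names for illustration only, not RLlib's actual code.

```python
from collections import namedtuple

# Hypothetical stand-in for RLlib's PolicyEvalData, reduced to two fields.
PolicyEvalData = namedtuple("PolicyEvalData", ["env_id", "agent_id"])


def pair_actions_with_eval_data(actions, eval_data):
    """Pair each action with the eval_data entry at the same index.

    Illustrative helper only: it mimics a loop that assumes
    len(actions) == len(eval_data).
    """
    pairs = []
    for i, action in enumerate(actions):
        # Raises IndexError as soon as i >= len(eval_data).
        entry = eval_data[i]
        pairs.append((entry.env_id, entry.agent_id, action))
    return pairs


# Same lengths as in the pdb session: two actions, one eval_data entry.
# The action contents are simplified placeholders.
actions = [{"Move": 1}, {"Move": 1}]
eval_data = [PolicyEvalData(env_id=0, agent_id=198)]

try:
    pair_actions_with_eval_data(actions, eval_data)
    mismatch_error = None
except IndexError as exc:
    mismatch_error = str(exc)

print(mismatch_error)
```

With one action and one eval_data entry the helper returns normally, so the error appears only when the two lists differ in length.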