Multi-agent APPO with variable agent numbers and horizon

How severely does this issue affect your experience of using Ray?

  • Medium

Hi, I’m trying to use the multi-agent implementation of APPO in an environment that

  • is multi-agent
  • has a variable number of agents
  • has variable trajectory lengths

but I am receiving what looks like a masking error:

File "/private/home/eugenevinitsky/.conda/envs/nocturned/lib/python3.8/site-packages/ray/rllib/agents/ppo/", line 98, in reduce_mean_valid
(APPOTrainer pid=2792319) return torch.sum(t[mask]) / num_valid
(APPOTrainer pid=2792319) IndexError: The shape of the mask [4, 264] at index 0 does not match the shape of the indexed tensor [7, 264] at index 0

Would anyone be able to explain the likely meaning of this error? It’s slightly hard to provide a reproducible example at the moment, as the environment is private and custom.
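
In lieu of a full reproduction, here is a minimal sketch of how I read the message. It only assumes that the failing line is PyTorch boolean-mask indexing, which requires the mask and the indexed tensor to have the same shape. The `reduce_mean_valid` below is a hypothetical stand-in modeled on the single line shown in the traceback (in particular, `num_valid = torch.sum(mask)` is my assumption), not RLlib's actual code, and the tensors are made up to match the reported shapes.

```python
import torch


def reduce_mean_valid(t, mask):
    # Hypothetical stand-in for the helper named in the traceback.
    # Assumption: num_valid is the number of True entries in the mask.
    num_valid = torch.sum(mask)
    return torch.sum(t[mask]) / num_valid


# Made-up data with the shapes reported in the error message.
values = torch.ones(7, 264)
good_mask = torch.ones(7, 264, dtype=torch.bool)  # same shape as values
bad_mask = torch.ones(4, 264, dtype=torch.bool)   # first dimension differs

# Matching shapes: boolean indexing selects the valid entries.
ok = reduce_mean_valid(values, good_mask)

# Mismatched first dimension: boolean indexing raises IndexError.
try:
    reduce_mean_valid(values, bad_mask)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If this reading is right, the mask and the loss tensor are being built with different sizes along their first dimension (4 versus 7), so the question becomes which part of the batch construction produces the two different sizes when agent counts and episode lengths vary.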