MultiAgent env wrong structures

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

Hi everyone!
I’m new to Ray. I chose it because I believe it’s a great option for RL projects.
I’m now trying to train on a multi-agent environment, such as simple_tag from PettingZoo, but I can’t make it work.
I always get the error ‘substructures are different size or nested’. When I look at the two structures, one of them is an array of (for example) 16 elements, and the other is a dict with keys ‘agent_0’, ‘agent_1’, each mapping to an array of 16 elements.
I think the first structure is the one the environment is returning, and the larger one is the one Ray is expecting.
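
To make the mismatch described above concrete, here is a minimal sketch of the two shapes involved. The helper `describe_structure` and the sample values are hypothetical; they only illustrate a flat 16-element observation versus a per-agent dict and are not part of Ray or PettingZoo.

```python
def describe_structure(obs):
    """Summarize nesting: a dict gives one length per agent ID, a flat sequence gives one length."""
    if isinstance(obs, dict):
        return {agent_id: len(values) for agent_id, values in obs.items()}
    return len(obs)


# Hypothetical sample data: zeros stand in for real observation values.
flat_obs = [0.0] * 16
per_agent_obs = {"agent_0": [0.0] * 16, "agent_1": [0.0] * 16}

print(describe_structure(flat_obs))
print(describe_structure(per_agent_obs))
```

Printing both summaries side by side shows at a glance whether one side is keyed by agent ID while the other is a single array.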

This is the config:
config = {
    "env": "simple_spread",
    "framework": "torch",  # or "tf" if you use TensorFlow
    "multiagent": {
        "policies": {
            "agent_0": (None, env_creator({}).observation_space, env_creator({}).action_space, {}),
            "agent_1": (None, env_creator({}).observation_space, env_creator({}).action_space, {}),
            "agent_2": (None, env_creator({}).observation_space, env_creator({}).action_space, {}),
        },
        "policy_mapping_fn": policy_mapping_fn,
    },
    "num_workers": 1,
}
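
The policy entries in the config follow the pattern (policy class, observation space, action space, policy config). A sketch of building that dict with one entry per agent might look like the following. `build_policies`, `spaces_by_agent`, the placeholder strings, and this version of `policy_mapping_fn` are all assumptions for illustration; in a real setup the values would be the space objects the environment reports for each agent.

```python
def build_policies(spaces_by_agent):
    """Build a policies dict with one (None, obs_space, action_space, {}) tuple per agent."""
    return {
        agent_id: (None, obs_space, action_space, {})
        for agent_id, (obs_space, action_space) in spaces_by_agent.items()
    }


def policy_mapping_fn(agent_id, *args, **kwargs):
    """Hypothetical mapping: every agent uses the policy that carries its own ID."""
    return agent_id


# Placeholder strings stand in for real observation/action space objects.
spaces_by_agent = {
    "agent_0": ("obs_space_0", "action_space_0"),
    "agent_1": ("obs_space_1", "action_space_1"),
    "agent_2": ("obs_space_2", "action_space_2"),
}
policies = build_policies(spaces_by_agent)
```

Collecting the spaces once and building the dict from them avoids calling env_creator({}) for every entry and keeps policy IDs and spaces in one place.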

What can I do? Is this an environment problem?
Thanks!

I will give some more information:
More specifically: Substructure "type=OrderedDict str=OrderedDict([('adversary_0', array([ 0.01255419, -0.6857402 , 3.5040925 , -0.5908824 , -0.28089884,
(PPO pid=13468) -0.42432716, -0.98887783, -0.8802794 , -0.559584 , -0.19044599,
(PPO pid=13468) -0.71634007, -0.06078899, -0.43675077, -0.15962653, 0.9411768 ,
(PPO pid=13468) -2.162692 ], dtype=float32)), ('adversary_1', array([-1.5311705 , 1.3489779 , 0.42647207, -1.2930255 , 0.34891367,
(PPO pid=13468) 0.35973787, 0.9409657 , 0.31296352, -1.4111371 , -0.66821265,
(PPO pid=13468) 1.8022188 , 0.07645752, 0.5687924 , -0.36956072, 0.8243999 ,
(PPO pid=13468) 0.09302095], dtype=float32)), ('adversary_2', array([ 1.3076917 , 1.1348339 , -0.41443637, 1.0638967 , 0.05962287,
(PPO pid=13468) -0.8662066 , -0.15366012, -1.1731756 , 0.46935457, -1.4058455 ,
(PPO pid=13468) 0.27503854, 0.27102682, -0.6621483 , -0.7810565 , -0.58224857,
(PPO pid=13468) 0.39142346], dtype=float32)), ('agent_0', array([ 0.27529287, 1.212923 , -1.0894041 , -0.28833956, -0.4597624 ,
(PPO pid=13468) 0.19036998, 1.4900975 , 0.06139584, 1.9097888 , 0.9566886 ,
(PPO pid=13468) -0.8215733 , 0.16532217, -1.4654043 , -0.7305452 ], dtype=float32))])" is a sequence, while substructure "type=ndarray str=[ 0. 0. -0.89107406 0.09816181 1.5538627 -0.97209406
(PPO pid=13468) 0.43329492 -0.93592656 0.16501105 -0.51356584 -0.07702168 -0.28925264
(PPO pid=13468) 0.70538473 0.14393562 0. 0. ]" is not
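
The message above shows observations keyed by adversary_0, adversary_1, adversary_2 and agent_0, with 16 values for each adversary and 14 for agent_0, while the config defines policies named agent_0, agent_1 and agent_2. A quick check like the sketch below can compare the two sets of IDs. `compare_agent_ids` is a hypothetical helper, the zero-filled lists only mirror the lengths visible in the log, and the comparison is only meaningful if the policy mapping function returns the agent ID unchanged, which is an assumption here.

```python
def compare_agent_ids(obs_by_agent, policy_ids):
    """Return (agent IDs with no matching policy, policy IDs with no matching agent), each sorted."""
    agents = set(obs_by_agent)
    policies = set(policy_ids)
    return sorted(agents - policies), sorted(policies - agents)


# Lengths mirror the log: 16 values per adversary, 14 for agent_0.
obs_by_agent = {
    "adversary_0": [0.0] * 16,
    "adversary_1": [0.0] * 16,
    "adversary_2": [0.0] * 16,
    "agent_0": [0.0] * 14,
}
policy_ids = ["agent_0", "agent_1", "agent_2"]

unmatched_agents, unused_policies = compare_agent_ids(obs_by_agent, policy_ids)
obs_lengths = {agent_id: len(values) for agent_id, values in obs_by_agent.items()}

print(unmatched_agents)
print(unused_policies)
print(obs_lengths)
```

The per-agent lengths are printed as well because agents with different observation sizes cannot all share the same observation space.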