Cannot concat data under key 'advantages'

Edit: Fixed

Hello,
While training A2C on my custom env, I came across this error:

ValueError: Cannot concat data under key 'advantages', b/c sub-structures under that key don't match. samples=[SampleBatch(20: ['obs', 'new_obs', 'actions', 'rewards', 'terminateds', 'truncateds', 'infos', 'eps_id', 'unroll_id', 'agent_index', 't', 'vf_preds', 'values_bootstrapped', 'advantages', 'value_targets']), SampleBatch(12: ['obs', 'new_obs', 'actions', 'rewards', 'terminateds', 'truncateds', 'infos', 'eps_id', 'unroll_id', 'agent_index', 't', 'vf_preds', 'values_bootstrapped', 'advantages', 'value_targets'])]
Original error:
all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 20 and the array at index 1 has size 12

I have looked closely at my custom env but cannot find what is wrong.
The only related topic I found was about malformed outputs from the env, so I have included some sample interactions from the env:

env.reset()

({'agent0': array([0.78885073, 0.68205817, 0.71528072, 0.92565924, 0.00974081,
1.84962887, 1.12484928, 1.67803646, 1.34944875, 1.7293651 ]),
'agent1': array([0.78885073, 0.68205817, 0.71528072, 0.92565924, 0.00974081,
1.84962887, 1.12484928, 1.67803646, 1.34944875, 1.7293651 ])},
{'agent0': {}, 'agent1': {}})

During an episode:

({'agent0': array([0.72995571, 1.50591078, 0.84888693, 1.65788078, 0.87479644,
0.09938497, 2. , 2. , 1.4 , 1.4 ]),
'agent1': array([0.72995571, 1.50591078, 0.84888693, 1.65788078, 0.87479644,
0.09938497, 2. , 2. , 1.4 , 1.4 ])},
{'agent0': 0.19132183, 'agent1': 0.19132183},
{'agent0': False, 'agent1': False, 'all': False},
{'agent0': False, 'agent1': False, 'all': False},
{'agent0': {}, 'agent1': {}})

At episode end:

({'agent0': array([0.84888693, 1.65788078, 0.87479644, 0.09938497, 2. ,
2. , 1.4 , 1.4 , 2. , 2. ]),
'agent1': array([0.84888693, 1.65788078, 0.87479644, 0.09938497, 2. ,
2. , 1.4 , 1.4 , 2. , 2. ])},
{'agent0': 0.33333334, 'agent1': 0.33333334},
{'agent0': True, 'agent1': True, 'all': True},
{'agent0': False, 'agent1': False, 'all': True},
{'agent0': {}, 'agent1': {}})

Any insights would be highly appreciated.
Thank you.

There was in fact an issue with the shape of the rewards.
Flattening them fixed the issue.
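
For anyone landing here with the same error, here is a rough NumPy-only sketch of how a reward shape problem could plausibly produce this exact message. This is a guess at the mechanism, not RLlib's actual code: I am assuming the rewards ended up with a trailing axis, shape (T, 1), and `toy_advantages` and `scalar_reward` are made-up helpers for illustration only.

```python
import numpy as np


def toy_advantages(rewards, vf_preds):
    """Toy stand-in for an advantage computation (NOT RLlib's implementation)."""
    return np.asarray(rewards) - np.asarray(vf_preds)


def concat_fails(arrays):
    """Return True if concatenating along axis 0 raises a ValueError."""
    try:
        np.concatenate(arrays, axis=0)
        return False
    except ValueError:
        return True


def scalar_reward(r):
    """Hypothetical env-side fix: turn a size-1 array reward into a plain float."""
    return float(np.asarray(r).reshape(-1)[0])


# Two sample fragments of length 20 and 12, as in the error message.
lengths = (20, 12)

# Assumed bug: rewards have shape (T, 1) while value predictions have shape (T,).
# Broadcasting (T, 1) against (T,) silently yields a (T, T) array.
bad = [toy_advantages(np.zeros((T, 1)), np.zeros(T)) for T in lengths]
bad_shapes = [a.shape for a in bad]

# Flattened rewards give shape (T,), which concatenates without trouble.
good = [toy_advantages(np.zeros((T, 1)).reshape(-1), np.zeros(T)) for T in lengths]
merged = np.concatenate(good, axis=0)

print(bad_shapes, concat_fails(bad), merged.shape)
```

With the unflattened rewards the two arrays differ along dimension 1 (20 vs 12), which matches the "Original error" text above. Returning plain Python floats per agent from `step()` avoids the problem at the source.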
