MARL Custom RNN Model Batch Shape (batch, seq, feature)

@CodingBurmer one comment that might be helpful: the first few passes through the model that you are seeing are probably not real data. They are dummy data pushed through by the trajectory_view code to determine the view requirements. In the examples I have run, I usually see 4 forward passes with dummy data during setup, before the training process actually starts.
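If you want to check this on your side, one option is to record the shape of every batch that reaches the model and skip the first few entries when you look at them. Below is a rough sketch in plain Python, not RLlib code: `nested_shape` and `ForwardPassLogger` are names I made up for illustration, and the default of 4 setup passes is only what I have observed, not something guaranteed. In a real torch or TF model you would just log the input tensor's shape inside `forward()`.

```python
from typing import Any, Callable, List, Tuple


def nested_shape(batch: Any) -> Tuple[int, ...]:
    """Return the shape of a rectangular nested list, e.g. (batch, seq, feature)."""
    shape = []
    while isinstance(batch, (list, tuple)):
        shape.append(len(batch))
        if not batch:
            break
        batch = batch[0]
    return tuple(shape)


class ForwardPassLogger:
    """Hypothetical helper: wraps a forward function and records input shapes."""

    def __init__(self, forward_fn: Callable[[Any], Any], assumed_setup_passes: int = 4):
        self.forward_fn = forward_fn
        # Assumption: the first N calls are dummy setup passes (4 in my runs).
        self.assumed_setup_passes = assumed_setup_passes
        self.shapes: List[Tuple[int, ...]] = []

    def __call__(self, batch: Any) -> Any:
        # Record the shape, then delegate to the wrapped forward function.
        self.shapes.append(nested_shape(batch))
        return self.forward_fn(batch)

    def shapes_after_setup(self) -> List[Tuple[int, ...]]:
        """Shapes of the calls made after the assumed setup passes."""
        return self.shapes[self.assumed_setup_passes:]
```

If the shapes returned by `shapes_after_setup()` look like (batch, seq, feature) while the earlier ones do not, the odd shapes were most likely just the setup passes.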

As sven1977 was saying, you could look at the centralized_critic_2 for an example of how you could share (rewrite) agent observations to include the observations of other agents during training. You need not actually use a centralized critic. Just take the sharing observations part of the example.
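To make the "sharing observations" part concrete, here is a minimal sketch of an observer-style function, loosely modelled on the one in that example. It is an illustration, not a copy: `share_observations` and the `other_obs` key are names I chose, and it assumes the function receives a dict mapping agent ids to raw observations and returns a dict with the same keys.

```python
from typing import Any, Dict


def share_observations(agent_obs: Dict[Any, Any], **kwargs) -> Dict[Any, Dict[str, Any]]:
    """Rewrite each agent's observation to also contain the other agents' observations.

    Assumes agent ids are sortable so that the order of `other_obs` is stable.
    Extra keyword arguments are accepted and ignored.
    """
    new_obs = {}
    for agent_id, own in agent_obs.items():
        # Collect everyone else's observation in a deterministic order.
        others = [agent_obs[other] for other in sorted(agent_obs) if other != agent_id]
        new_obs[agent_id] = {"own_obs": own, "other_obs": others}
    return new_obs
```

As far as I remember, the example passes a function of this shape as `observation_fn` in the multiagent config, so please check that against your RLlib version. The observation space you declare for each policy also has to match the rewritten structure.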

Do you have a sample repo or a minimal example you could share?