Multi-Agent cyclic games with paused agents

Hi @Aceticia,

At the end of the episode, I think all you should have to do is return a dictionary with a reward for each agent.

If you look at this code here, especially lines 774-779, you will see that when the env returns all_done = True, it creates an empty obs for each agent that is not in the final observation.

Personally, I would do it myself in my env so that the semantics are really clear, but based on this code it should work fine either way.
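
To make the "do it in the env" option concrete, here is a minimal sketch of what I mean. It is plain Python with no RLlib import, and it assumes the dict-based step return (obs, rewards, dones, infos) with an "__all__" key in dones. The class name `TurnBasedEnv`, the two agent ids, and the reward scheme are all made up for illustration, so adapt them to your game.

```python
class TurnBasedEnv:
    """Hypothetical two-agent turn-based game (illustration only)."""

    def __init__(self, max_steps=4):
        self.agents = ["agent_0", "agent_1"]
        self.max_steps = max_steps
        self.t = 0
        self.scores = {a: 0.0 for a in self.agents}

    def reset(self):
        self.t = 0
        self.scores = {a: 0.0 for a in self.agents}
        # Only the agent whose turn it is receives an observation.
        return {self.agents[0]: self.t}

    def step(self, action_dict):
        # Exactly one agent acts per step; the other one is paused.
        (actor, action), = action_dict.items()
        self.scores[actor] += float(action)
        self.t += 1
        done = self.t >= self.max_steps

        if done:
            # Final step: emit an obs and a reward for EVERY agent,
            # including the paused one, so nothing is left implicit.
            obs = {a: self.t for a in self.agents}
            rewards = dict(self.scores)
        else:
            # Mid-episode: only the next agent to act is included.
            nxt = self.agents[self.t % len(self.agents)]
            obs = {nxt: self.t}
            rewards = {nxt: 0.0}

        dones = {"__all__": done}
        infos = {}
        return obs, rewards, dones, infos
```

With this layout the last step always carries a reward and an obs for both agents, so you do not depend on the sampler filling in the missing observation for you.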

@sven1977 or @gjoliver can you confirm?

