At the end of the episode, I think all you should have to do is return a dictionary with a reward for each agent.
If you look at this code here, especially lines 774-779, you will see that when the env returns all_done = True, it will create an empty obs for each agent that is not in the final observation.
Personally, I would return the final obs for every agent myself in my env so that the semantics are really clear, but based on this code it should work fine either way.
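Here is a rough sketch of what I mean by doing it yourself in the env. It is a toy example, not tied to any library's base class: the class name `TwoAgentEnv`, the agent ids, the reward values, and the `"__all__"` key in the dones dict are all assumptions for illustration, so adapt them to whatever your framework expects.

```python
from typing import Dict, Tuple


class TwoAgentEnv:
    """Hypothetical toy multi-agent env (names and values are illustrative)."""

    def __init__(self, max_steps: int = 3) -> None:
        self.agents = ["agent_0", "agent_1"]  # assumed agent ids
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> Dict[str, int]:
        self.t = 0
        return {agent: 0 for agent in self.agents}

    def step(
        self, actions: Dict[str, int]
    ) -> Tuple[Dict[str, int], Dict[str, float], Dict[str, bool], dict]:
        self.t += 1
        episode_over = self.t >= self.max_steps

        if episode_over:
            # Final step: explicitly include every agent, even ones that
            # did not act this step, so nothing has to be filled in for us.
            obs = {agent: self.t for agent in self.agents}
            rewards = {agent: 1.0 for agent in self.agents}
        else:
            # Ordinary step: only the agents that acted get an entry.
            obs = {agent: self.t for agent in actions}
            rewards = {agent: 0.0 for agent in actions}

        # "__all__" as the episode-level done flag is an assumed convention.
        dones = {"__all__": episode_over}
        return obs, rewards, dones, {}
```

With this, even if only `agent_0` acts on the last step, the returned rewards and obs still contain an entry for `agent_1`, so the end-of-episode semantics do not depend on what the sampler fills in.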