I am using a custom env. In theory it allows one agent to exit early.
When I set up the dones dict in the step function like this (note: Discourse renders `__all__` as bold, so it may appear as "all" below):

```python
if white_done and black_done:
    new_done = {"__all__": True}
elif white_done and not black_done:
    new_done = {"white": True, "__all__": False}
elif black_done and not white_done:
    new_done = {"black": True, "__all__": False}
else:
    new_done = {"__all__": False}
```
I get an error: `Batches sent to postprocessing must only contain steps from a single trajectory`.
When I pass in:

```python
if white_done and black_done:
    new_done = {"__all__": True}
elif white_done and not black_done:
    new_done = {"__all__": False}
elif black_done and not white_done:
    new_done = {"__all__": False}
else:
    new_done = {"__all__": False}
```
it works. However, RLlib also expects an obs dict with both agents' information, even though one agent may be done.
This probably won't solve it entirely, but I think the intention is that every agent that has a value in one of the dictionaries should have a value in all of them. So when your agents are not done, they should still have an entry in the dones dictionary, e.g. `{"white": False, ...}`. It seems to work without that, so maybe it is not needed, but that is how I do it. You will also need to decide on a terminal observation and reward. I think once an agent has returned done it no longer needs to appear in the dictionaries, but I have not checked that.
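To illustrate, here is a minimal sketch of a `step()` return where every agent appearing in one dict appears in all of them. The agent IDs `"white"`/`"black"` come from the thread, but the helper name, the dummy rewards, and the `TERMINAL_OBS` placeholder are my assumptions, not something from RLlib itself:

```python
# Hypothetical placeholder terminal observation for an agent that is done.
TERMINAL_OBS = [0.0, 0.0]

def build_step_return(white_obs, black_obs, white_done, black_done):
    """Build (obs, rewards, dones, infos) with consistent agent keys."""
    obs = {
        "white": TERMINAL_OBS if white_done else white_obs,
        "black": TERMINAL_OBS if black_done else black_obs,
    }
    rewards = {"white": 0.0, "black": 0.0}  # dummy rewards for illustration
    dones = {
        "white": white_done,
        "black": black_done,
        # The special "__all__" key tells RLlib when the whole episode ends.
        "__all__": white_done and black_done,
    }
    infos = {"white": {}, "black": {}}
    return obs, rewards, dones, infos

# White is done, black is not: both still appear in every dict.
obs, rewards, dones, infos = build_step_return([1.0], [2.0], True, False)
```

The key point is that `obs`, `rewards`, `dones`, and `infos` all share the same agent keys for as long as an agent is part of the episode.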
Hey @kia, at first glance this seems like a bug. The first script also looks ok and should work.
Could you provide a small, self-contained script with a dummy 2-player env that reproduces the error?
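For reference, a dummy 2-player env for such a repro might look roughly like the sketch below. All names and the step counts are made up; a real repro would subclass `ray.rllib.env.MultiAgentEnv` and register the env with a trainer, which is omitted here to keep it self-contained:

```python
# Plain-Python stand-in for a MultiAgentEnv where "white" exits early.
class DummyTwoPlayerEnv:
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return {"white": [0.0], "black": [0.0]}

    def step(self, action_dict):
        self.t += 1
        white_done = self.t >= 2  # white's trajectory ends after 2 steps
        black_done = self.t >= 4  # black keeps playing until step 4
        obs = {"white": [float(self.t)], "black": [float(self.t)]}
        rewards = {"white": 1.0, "black": 1.0}
        dones = {
            "white": white_done,
            "black": black_done,
            "__all__": white_done and black_done,
        }
        return obs, rewards, dones, {"white": {}, "black": {}}

env = DummyTwoPlayerEnv()
obs = env.reset()
```

Driving this env with a trivial policy for a few steps should show whether the per-agent done entries trigger the postprocessing error.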