Hello,
I’ve implemented my simulator using the ExternalEnv API, i.e. my simulator env runs the episode loop itself: it queries the policy to obtain actions, steps the simulation, and logs the returns (see the code snippet below).
@override(ExternalMultiAgentEnv)
def run(self):
    obs = self.reset()  # reset the external simulator
    eid = self.start_episode()
    while True:
        action = self.get_action(eid, obs)  # query the policy for the next action
        # action = {obs.agent_id: 101}
        obs, reward, done, info = self.step(action)  # advance the external simulator
        self.log_returns(eid, reward, info)
        if done:
            self.end_episode(eid, obs)
            obs = self.reset()
            eid = self.start_episode()
My problem is that once the external simulator env has sampled enough steps and RLlib starts its first training step, the simulator env crashes: get_action raises an Empty exception after a 60-second timeout (“queue empty”). My guess is that the simulator env keeps querying the policy for an action while RLlib is already busy with the training step, so no action is put on the queue before the timeout expires. What can I do to prevent this problem?
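To make the discussion concrete: the only workaround I can think of is to catch the timeout and simply ask again, roughly like the sketch below. This assumes get_action really raises a plain queue.Empty on timeout (that is what the traceback suggests); retries and _get_action_with_retry are just names I made up for illustration, not part of the RLlib API.

import queue

def _get_action_with_retry(self, eid, obs, retries=5):
    # Hypothetical helper: retry get_action when the action queue times out,
    # e.g. because the policy is currently busy with a training step.
    for _ in range(retries):
        try:
            return self.get_action(eid, obs)
        except queue.Empty:
            continue  # no action arrived within the timeout, ask again
    raise RuntimeError("policy did not return an action after several retries")

This avoids the crash, but it feels like a hack. Is retrying like this actually safe, or is there a proper way (e.g. a config option or a longer timeout) to handle the case where training blocks the action queue?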