External Env crashes during training step


I’ve implemented my simulator with the ExternalEnv API, i.e. my simulator-env itself queries the policy for actions and logs returns (see the code snippet below).

def run(self):
    # reset()/step() are my simulator's own methods;
    # start_episode/get_action/log_returns/end_episode come from ExternalEnv.
    obs = self.reset()
    eid = self.start_episode()

    while True:
        action = self.get_action(eid, obs)
        # action = {obs.agent_id: 101}
        obs, reward, done, info = self.step(action)
        self.log_returns(eid, reward, info=info)
        if done:
            self.end_episode(eid, obs)
            obs = self.reset()
            eid = self.start_episode()

My problem: once the external simulator-env has sampled enough steps and RLlib begins its first train step, the simulator-env crashes because get_action raises an Empty exception after a 60-second timeout (“queue empty”).

I guess the external simulator-env is still querying the policy for an action while RLlib has already started a train step. What can I do to prevent this?
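One workaround I’ve considered is catching the timeout and retrying, so the loop survives while a train step blocks the policy. This is only a sketch: `get_action_with_retry` is a hypothetical helper I wrote, not an RLlib API; the `queue.Empty` it catches is what RLlib’s action queue raises on timeout.

```python
import queue
import time


def get_action_with_retry(get_action, eid, obs, retries=5, backoff_s=0.1):
    """Hypothetical wrapper: retry get_action() when the action queue
    times out (queue.Empty), e.g. because a train step is in progress."""
    for _ in range(retries):
        try:
            return get_action(eid, obs)
        except queue.Empty:
            time.sleep(backoff_s)  # back off, then ask the policy again
    raise RuntimeError(f"no action after {retries} retries")
```

In the run loop above, `action = self.get_action(eid, obs)` would then become `action = get_action_with_retry(self.get_action, eid, obs)`.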


Hi @klausk55,

Are you using local_inference or remote_inference?

Hi @mannyv,

I don’t use a server or client (i.e. there is no PolicyClient class or anything like that!).
I use the ExternalEnv API for a “custom use case” where my external simulator-env queries the policy for actions and logs rewards.
There are reset and step methods and the run loop exactly as in the code snippet above, and all the PPOTrainer gets is simply “env=MyExtEnvSimulator” (i.e. the registered class).
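For context, the registration I mean looks roughly like this (a sketch assuming MyExtEnvSimulator subclasses ExternalEnv; the `my_sim` module path and the config values are placeholders):

```python
import ray
from ray.tune.registry import register_env
from ray.rllib.agents.ppo import PPOTrainer

from my_sim import MyExtEnvSimulator  # placeholder import

ray.init()

# Register the external env under a name RLlib can look up.
register_env("MyExtEnvSimulator",
             lambda env_config: MyExtEnvSimulator(env_config))

# The trainer only sees the registered name, no client/server setup.
trainer = PPOTrainer(env="MyExtEnvSimulator", config={"num_workers": 1})
```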
