Toy example for using ExternalEnv API

Hello people,

Is my toy example a correct and viable way to use the ExternalEnv API?

test.py:

from typing import Dict
import ray
from ray.rllib.agents.ppo.ppo import PPOTrainer
from ray.tune.logger import pretty_print
from ray.tune.registry import register_env

from gym.spaces import Discrete, Box

from toy_external_env import ToyExternalEnv

def env_creator(config: Dict):
    action_space = Discrete(2)
    observation_space = Box(-10, 10, (4,))
    return ToyExternalEnv(action_space, observation_space)

if __name__ == "__main__":
    
    ray.init()

    register_env("TEE", env_creator)

    trainer = PPOTrainer(config={"framework": "tf2"}, env="TEE")
    results = trainer.train()
    # pretty_print returns a string, so print it explicitly
    print(pretty_print(results))
    print("Training end")

toy_external_env.py:

from ray.rllib.env.external_env import ExternalEnv
from ray.rllib.utils.annotations import override

import gym

class ToyExternalEnv(ExternalEnv):

    def __init__(self, action_space: gym.Space, observation_space: gym.Space,
                 max_concurrent: int = 100):
        super().__init__(action_space, observation_space,
                         max_concurrent=max_concurrent)

        # self.simulator = DES()  # the real discrete-event simulation
        self.simulator = gym.make("CartPole-v0")  # CartPole as a stand-in

    @override(ExternalEnv)
    def run(self):
        # run() executes in its own thread; the environment drives the
        # simulation and talks to RLlib through an episode ID handle.
        obs = self.simulator.reset()
        self.simulator.render()
        eid = self.start_episode()

        while True:
            # Query the policy for an action, advance the simulator,
            # and log the reward under the same episode ID.
            action = self.get_action(eid, obs)
            obs, reward, done, info = self.simulator.step(action)
            self.simulator.render()
            self.log_returns(eid, reward, info)
            if done:
                self.end_episode(eid, obs)
                obs = self.simulator.reset()
                eid = self.start_episode()

Does everyone agree with me, or are there objections (or other ideas) about how one can use the ExternalEnv API?

Hello @klausk55, I’m new to RLlib and I’m trying to run an ExternalEnv with Tune. Were you able to run the code shown here? Did you find a better way to do it?
Tomorrow I’ll try the approach you describe anyway. Thanks!

Hey @hermmanhender,

my experience in combination with Tune is almost zero, but yes, this code example (toy example) worked for me when I tried it with RLlib alone.
I can’t tell you whether there is a better way or not, perhaps there is one :sweat_smile:
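If I wanted to try it with Tune, I’d probably start from something like the minimal, untested sketch below. It assumes the same env_creator and registered name "TEE" as in my test.py above (the import from test is just illustrative):

import ray
from ray import tune
from ray.tune.registry import register_env

from test import env_creator  # illustrative: reuse the creator from test.py

if __name__ == "__main__":
    ray.init()
    register_env("TEE", env_creator)

    # Tune drives the train() loop itself; "PPO" resolves to the same
    # PPOTrainer used above.
    tune.run(
        "PPO",
        stop={"training_iteration": 1},
        config={"env": "TEE", "framework": "tf2"},
    )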
I started out doing my simulations with an ExternalEnv (in the same manner as above), but in the meantime I’ve switched to a normal MultiAgentEnv or plain gym.Env and integrated my discrete-event simulation there.
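Roughly, that switch means the simulation is advanced from inside reset()/step() instead of from ExternalEnv.run(). A minimal sketch of what I mean (the class name is made up, and CartPole again stands in for the discrete-event simulation):

import gym
from gym.spaces import Discrete, Box

class ToyGymEnv(gym.Env):
    """Sketch: the simulator is advanced inside reset()/step()
    instead of running its own loop as in ExternalEnv.run()."""

    def __init__(self, config=None):
        self.action_space = Discrete(2)
        self.observation_space = Box(-10, 10, (4,))
        self.simulator = gym.make("CartPole-v0")  # stand-in for a DES

    def reset(self):
        return self.simulator.reset()

    def step(self, action):
        return self.simulator.step(action)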