[RLlib] How to access the environment of 'remote worker environments'?

Hello,

My training setup consists of 1 worker with multiple remote environments, but I am having trouble accessing those environments. Following the curriculum example on docs.ray.io, remote vector environments do not expose an 'env' attribute directly. As far as I can tell, the remote environments are grouped together as a list of actor handles per worker. The BaseEnv class has a 'get_unwrapped()' method, but it is not implemented for remote environments. I also tried accessing the 'env' attribute directly, but the remote (actor) nature of these environments complicates things (at least for me).
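For context, the mental model I have of a remote vector env is roughly the following. This is a simplified sketch only, not RLlib's actual RemoteVectorEnv code: each sub-environment lives inside its own Ray actor, so the worker only holds actor handles rather than the gym.Env objects themselves.

import ray
import gym

# Simplified mental model only -- not RLlib's actual RemoteVectorEnv internals.
@ray.remote
class _SubEnvActor:
    def __init__(self, env_creator):
        self.env = env_creator()  # the real gym.Env lives inside the actor process

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)

# The worker ends up holding a list of ActorHandles, e.g.:
#   actors = [_SubEnvActor.remote(lambda: gym.make("CartPole-v0")) for _ in range(2)]
# An ActorHandle has no 'env' attribute on the caller side, which is why
# direct attribute access fails.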

Please see a simple reproduction script below, together with the error it produces.

import ray, gym
from ray import tune
from ray.rllib.agents.ppo import DEFAULT_CONFIG as PPO_DEFAULT_CONFIG, PPOTrainer, PPOTFPolicy
from ray.rllib.agents.callbacks import DefaultCallbacks

class MyEnv(gym.Env):
    """Toy env whose curriculum 'phase' is switched via set_phase()."""
    def __init__(self, env_config):
        self.action_space = gym.spaces.Box(0, 1, shape=(1,))
        self.observation_space = gym.spaces.Box(0, 1, shape=(2,))
        self.phase = 1
    def reset(self):
        self.steps = 0
        return self.observation_space.sample()
    def step(self, action):
        self.steps += 1
        return self.observation_space.sample(), 0, self.steps > 10, {}
    def set_phase(self, phase):
        print("Phase set: {}".format(phase))
        self.phase = phase

class MyCallbacks(DefaultCallbacks):
    def on_train_result(self, **kwargs):
        """ Curriculum learning as seen in Ray docs """
        result = kwargs["result"]
        if result["episode_reward_mean"] > 200:
            phase = 0
        else:
            phase = 1
        trainer = kwargs["trainer"]

        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: env.set_phase(phase))) # Problem when remote_worker_envs = True

config = PPO_DEFAULT_CONFIG.copy()
config["env"] = MyEnv
config["num_workers"] = 0
config["num_envs_per_worker"] = 2
config["remote_worker_envs"] = True # Culprit
config["callbacks"] = MyCallbacks

ray.init(num_gpus=1)
ray.tune.run(
    PPOTrainer,
    config=config,
)

AttributeError: 'RemoteVectorEnv' object has no attribute 'set_phase'
With remote_worker_envs = False, the script runs fine.

I also tried accessing the envs via the actor handles:

    def on_train_result(self, **kwargs):
        """ Curriculum learning as seen in Ray docs """
        result = kwargs["result"]
        if result["episode_reward_mean"] > 200:
            phase = 0
        else:
            phase = 1
        trainer = kwargs["trainer"]

        def update_phase(env):
            for actor in env.actors:
                actor.env.set_phase(phase)

        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: update_phase(env)))

But this gives the following error:
AttributeError: 'ActorHandle' object has no attribute 'env'

So the question remains: how do you access the individual environments?
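As far as I understand Ray actors, an ActorHandle only exposes remote method calls, so something like the sketch below would be the general pattern (replacing update_phase in the callback above). I have not been able to confirm that the actor class RLlib creates for remote sub-envs actually exposes set_phase, so this may fail in the same way:

def update_phase(base_env, phase):
    # ActorHandles never expose the wrapped object's attributes; every call
    # has to go through .remote() and be fetched with ray.get().
    futures = [actor.set_phase.remote(phase) for actor in base_env.actors]
    ray.get(futures)  # block until all remote envs have been updated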

Thanks in advance

I have the same question. I need to access each environment of the remote workers after n episodes for debugging reasons, meaning I need to manipulate the environment classes directly.
Is this somehow possible?

Sadly I did not make any progress on this but I’m still interested in hearing if anyone does.

Hey @zalador @SebastianBo1995, I created a PR to fix this limitation. Could you take a look here?

There is also an example script that verifies it works with RLlib's set_task API (similar to your set_phase method, which should of course also work).
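For reference, here is a rough sketch of what a set_task-capable env could look like, assuming the TaskSettableEnv API and the env_task_fn config option; the exact module path and config key may differ between Ray versions:

import gym
from ray.rllib.env.apis.task_settable_env import TaskSettableEnv

class MyCurriculumEnv(TaskSettableEnv):
    """Same toy env as above, but using the task-settable interface."""

    def __init__(self, env_config):
        self.action_space = gym.spaces.Box(0, 1, shape=(1,))
        self.observation_space = gym.spaces.Box(0, 1, shape=(2,))
        self.task = 1  # current curriculum "phase"

    def reset(self):
        self.steps = 0
        return self.observation_space.sample()

    def step(self, action):
        self.steps += 1
        return self.observation_space.sample(), 0, self.steps > 10, {}

    # TaskSettableEnv interface: RLlib calls these through the BaseEnv,
    # which also covers the case where the sub-envs are remote actors.
    def set_task(self, task):
        print("Task set: {}".format(task))
        self.task = task

    def get_task(self):
        return self.task

# A curriculum function that RLlib calls after each training iteration,
# instead of updating phases manually in on_train_result:
def curriculum_fn(train_results, task_settable_env, env_ctx):
    return 0 if train_results["episode_reward_mean"] > 200 else 1

# config["env"] = MyCurriculumEnv
# config["env_task_fn"] = curriculum_fn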


Hi,
I am unable to test this for the foreseeable future. Thank you for the fix; hopefully someone else can verify it.