[RLlib] How to access the environment of 'remote worker environments'?

Hello,

My training setup consists of 1 worker with multiple remote environments, but I am having trouble accessing those environments. Following the curriculum example on docs.ray.io, remote vector environments do not expose an 'env' attribute directly. As far as I can tell, the remote environments are grouped together as a list of actor handles per worker. The BaseEnv class has a 'get_unwrapped()' method, but it is not implemented for remote environments. I also tried accessing the 'env' attribute directly, but the remote (actor) nature of these environments complicates things (at least for me).
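For context, the mental model I have of a remote vector env is roughly the following. This is a simplified sketch only, not RLlib's actual RemoteVectorEnv code: each sub-environment lives inside its own Ray actor, so the worker only holds actor handles rather than the gym.Env objects themselves.

import ray
import gym

# Simplified mental model only -- not RLlib's actual RemoteVectorEnv internals.
@ray.remote
class _SubEnvActor:
    def __init__(self, env_creator):
        self.env = env_creator()  # the real gym.Env lives inside the actor process

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)

# The worker ends up holding a list of ActorHandles, e.g.:
#   actors = [_SubEnvActor.remote(lambda: gym.make("CartPole-v0")) for _ in range(2)]
# An ActorHandle has no 'env' attribute on the caller side, which is why
# direct attribute access fails.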

Please see a simple reproduction script below, together with the error it produces.

import ray, gym
from ray import tune
from ray.rllib.agents.ppo import DEFAULT_CONFIG as PPO_DEFAULT_CONFIG, PPOTrainer, PPOTFPolicy
from ray.rllib.agents.callbacks import DefaultCallbacks

class MyEnv(gym.Env):
    """Toy env whose curriculum 'phase' is switched via set_phase()."""
    def __init__(self, env_config):
        self.action_space = gym.spaces.Box(0, 1, shape=(1,))
        self.observation_space = gym.spaces.Box(0, 1, shape=(2,))
        self.phase = 1
    def reset(self):
        self.steps = 0
        return self.observation_space.sample()
    def step(self, action):
        self.steps += 1
        return self.observation_space.sample(), 0, self.steps > 10, {}
    def set_phase(self, phase):
        print("Phase set: {}".format(phase))
        self.phase = phase

class MyCallbacks(DefaultCallbacks):
    def on_train_result(self, **kwargs):
        """ Curriculum learning as seen in Ray docs """
        result = kwargs["result"]
        if result["episode_reward_mean"] > 200:
            phase = 0
        else:
            phase = 1
        trainer = kwargs["trainer"]

        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: env.set_phase(phase))) # Problem when remote_worker_envs = True

config = PPO_DEFAULT_CONFIG.copy()
config["env"] = MyEnv
config["num_workers"] = 0
config["num_envs_per_worker"] = 2
config["remote_worker_envs"] = True # Culprit
config["callbacks"] = MyCallbacks

ray.init(num_gpus=1)
ray.tune.run(
    PPOTrainer,
    config=config,
)

AttributeError: 'RemoteVectorEnv' object has no attribute 'set_phase'
With remote_worker_envs = False, the script runs fine.

I also tried accessing the envs via the actor handles:

    def on_train_result(self, **kwargs):
        """ Curriculum learning as seen in Ray docs """
        result = kwargs["result"]
        if result["episode_reward_mean"] > 200:
            phase = 0
        else:
            phase = 1
        trainer = kwargs["trainer"]

        def update_phase(env):
            for actor in env.actors:
                actor.env.set_phase(phase)

        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: update_phase(env)))

But this gives the following error:
AttributeError: 'ActorHandle' object has no attribute 'env'

So the question remains: how do you access the individual environments?
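As far as I understand Ray actors, an ActorHandle only exposes remote method calls, so something like the sketch below would be the general pattern (replacing update_phase in the callback above). I have not been able to confirm that the actor class RLlib creates for remote sub-envs actually exposes set_phase, so this may fail in the same way:

def update_phase(base_env, phase):
    # ActorHandles never expose the wrapped object's attributes; every call
    # has to go through .remote() and be fetched with ray.get().
    futures = [actor.set_phase.remote(phase) for actor in base_env.actors]
    ray.get(futures)  # block until all remote envs have been updated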

Thanks in advance

I have the same question. I need to access each environment of the remote workers after n episodes for debugging reasons, meaning I need to manipulate the environment classes directly.
Is this somehow possible?

Sadly I did not make any progress on this but I’m still interested in hearing if anyone does.

Hey @zalador @SebastianBo1995, I created a PR to fix this limitation. Could you take a look here?

There is also an example script that verifies it works with RLlib's set_task API (similar to your set_phase method, which should of course also work).
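For reference, here is a rough sketch of what a set_task-capable env could look like, assuming the TaskSettableEnv API and the env_task_fn config option; the exact module path and config key may differ between Ray versions:

import gym
from ray.rllib.env.apis.task_settable_env import TaskSettableEnv

class MyCurriculumEnv(TaskSettableEnv):
    """Same toy env as above, but using the task-settable interface."""

    def __init__(self, env_config):
        self.action_space = gym.spaces.Box(0, 1, shape=(1,))
        self.observation_space = gym.spaces.Box(0, 1, shape=(2,))
        self.task = 1  # current curriculum "phase"

    def reset(self):
        self.steps = 0
        return self.observation_space.sample()

    def step(self, action):
        self.steps += 1
        return self.observation_space.sample(), 0, self.steps > 10, {}

    # TaskSettableEnv interface: RLlib calls these through the BaseEnv,
    # which also covers the case where the sub-envs are remote actors.
    def set_task(self, task):
        print("Task set: {}".format(task))
        self.task = task

    def get_task(self):
        return self.task

# A curriculum function that RLlib calls after each training iteration,
# instead of updating phases manually in on_train_result:
def curriculum_fn(train_results, task_settable_env, env_ctx):
    return 0 if train_results["episode_reward_mean"] > 200 else 1

# config["env"] = MyCurriculumEnv
# config["env_task_fn"] = curriculum_fn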


Hi,
I am unable to test this for the foreseeable future. Thank you for the fix; hopefully someone else can verify it.