Hello,
My training setup consists of one worker with multiple remote environments, but I am having trouble accessing those environments. Following the curriculum learning example on docs.ray.io, remote vector environments do not expose an 'env' attribute directly. As far as I can tell, the remote environments are grouped together as a list of actors per worker. The BaseEnv class has a 'get_unwrapped()' method, but it is not implemented for remote environments. I also tried accessing the 'env' attribute of the actors directly, but the remote nature of Ray actors complicates things (at least for me).
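For reference, this is roughly what I observed when poking at the worker's env in an interactive session (attribute names based on my reading of the RemoteVectorEnv source, so apologies if I misremember details):

# With num_workers=0 the local worker holds the envs:
base_env = trainer.workers.local_worker().async_env
base_env.get_unwrapped()  # -> empty; not implemented for RemoteVectorEnv
base_env.actors           # -> list of Ray ActorHandles, one per sub-env
base_env.actors[0].env    # -> AttributeError: 'ActorHandle' object has no attribute 'env'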
Here is a simple reproduction script, followed by the error it produces:
import gym
import ray
from ray import tune
from ray.rllib.agents.ppo import DEFAULT_CONFIG as PPO_DEFAULT_CONFIG, PPOTrainer
from ray.rllib.agents.callbacks import DefaultCallbacks


class MyEnv(gym.Env):
    def __init__(self, env_config):
        self.action_space = gym.spaces.Box(0, 1, shape=(1,))
        self.observation_space = gym.spaces.Box(0, 1, shape=(2,))
        self.phase = 1

    def reset(self):
        self.steps = 0
        return self.observation_space.sample()

    def step(self, action):
        self.steps += 1
        return self.observation_space.sample(), 0, self.steps > 10, {}

    def set_phase(self, phase):
        print("Phase set: {}".format(phase))
        self.phase = phase


class MyCallbacks(DefaultCallbacks):
    def on_train_result(self, **kwargs):
        """Curriculum learning as seen in the Ray docs."""
        result = kwargs["result"]
        if result["episode_reward_mean"] > 200:
            phase = 0
        else:
            phase = 1
        trainer = kwargs["trainer"]
        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: env.set_phase(phase)))  # Problem when remote_worker_envs = True


config = PPO_DEFAULT_CONFIG.copy()
config["env"] = MyEnv
config["num_workers"] = 0
config["num_envs_per_worker"] = 2
config["remote_worker_envs"] = True  # Culprit
config["callbacks"] = MyCallbacks

ray.init(num_gpus=1)
tune.run(
    PPOTrainer,
    config=config,
)
AttributeError: 'RemoteVectorEnv' object has no attribute 'set_phase'
With remote_worker_envs = False, on the other hand, the script runs fine.
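If I read the RolloutWorker source correctly, the error makes sense: foreach_env falls back to passing the BaseEnv itself to the function when get_unwrapped() comes back empty, and RemoteVectorEnv does not provide any unwrapped envs. Paraphrased from memory, so possibly not verbatim:

# My reading of RolloutWorker.foreach_env (paraphrased, may not be verbatim):
def foreach_env(self, func):
    envs = self.async_env.get_unwrapped()
    if not envs:  # RemoteVectorEnv yields no unwrapped envs ...
        return [func(self.async_env)]  # ... so func hits the RemoteVectorEnv itself
    return [func(env) for env in envs]

That would explain why set_phase ends up being called on the RemoteVectorEnv object.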
I then tried to reach the envs through the underlying actors:
def on_train_result(self, **kwargs):
    """Curriculum learning as seen in the Ray docs."""
    result = kwargs["result"]
    if result["episode_reward_mean"] > 200:
        phase = 0
    else:
        phase = 1
    trainer = kwargs["trainer"]

    def update_phase(env):
        # 'env' here is the RemoteVectorEnv; its .actors are Ray ActorHandles
        for actor in env.actors:
            actor.env.set_phase(phase)

    trainer.workers.foreach_worker(
        lambda ev: ev.foreach_env(
            lambda env: update_phase(env)))
But this gives the following error:
AttributeError: 'ActorHandle' object has no attribute 'env'
That error makes sense in hindsight, since an ActorHandle only exposes remote method stubs, not the attributes of the object it wraps. So the question remains: how do you access the individual environments when remote_worker_envs = True?
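For concreteness, something along these lines is what I was hoping to be able to write (purely hypothetical; as far as I can tell the per-env actors do not forward arbitrary env methods like set_phase):

# Hypothetical usage, NOT working code: assumes the per-env actors
# forwarded env methods such as set_phase as remote calls.
def update_phase(base_env, phase):
    futures = [actor.set_phase.remote(phase) for actor in base_env.actors]
    ray.get(futures)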
Thanks in advance