When I use the PPO algorithm to train in my own environment, task creation fails with the following error message (I removed the IP information):
ray::RolloutWorker.apply() (pid=22978, actor_id=560bce4a2b5858d94db5183601000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f61293b2d30>)
  File "/data/one/gsx/miniconda3/envs/nasim/lib/python3.9/site-packages/ray/rllib/utils/actor_manager.py", line 181, in apply
    if self.config.recreate_failed_workers:
AttributeError: 'RolloutWorker' object has no attribute 'config'

2024-01-26 14:16:04,892 ERROR actor_manager.py:506 -- Ray error, taking actor 2 out of service. ray::RolloutWorker.apply() (pid=22986, actor_id=8568b58b12ab33d806945ecd01000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7fbe10e7c0d0>)
  File "/data/one/gsx/miniconda3/envs/nasim/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 611, in <lambda>
    lambda w: w.assert_healthy()
  File "/data/one/gsx/miniconda3/envs/nasim/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 646, in assert_healthy
    is_healthy = self.policy_map and self.input_reader and self.output_writer
AttributeError: 'RolloutWorker' object has no attribute 'policy_map'
But when I use env="CartPole-v1", it works fine.
The source code is as follows:
config.environment(env=NASimEnv, env_config=env_kwargs)
# config.environment(env="CartPole-v1")
config.rollouts(num_rollout_workers=12)
config.resources(num_gpus=1,
                 num_cpus_per_worker=1)
if base_config["no_tune"]:
    algo = config.build()
    # run manual training loop and print results after each iteration
    for i in range(50):
        result = algo.train()
    algo.stop()
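For reference, the full script structure is roughly the following (a minimal runnable sketch; the imports and the ray.init()/ray.shutdown() calls are filled in by me, and it is shown with the CartPole-v1 variant that works, while swapping the .environment(...) call back to env=NASimEnv, env_config=env_kwargs reproduces the failure):

# Minimal sketch of my setup, shown with the CartPole-v1 variant that works.
# Replacing the .environment(...) call with env=NASimEnv, env_config=env_kwargs
# (as in the snippet above) is what triggers the RolloutWorker error.
import ray
from ray.rllib.algorithms.ppo import PPOConfig

ray.init()

config = (
    PPOConfig()
    .environment(env="CartPole-v1")
    .rollouts(num_rollout_workers=12)
    .resources(num_gpus=1, num_cpus_per_worker=1)
)

algo = config.build()
for i in range(50):
    result = algo.train()
    print(f"iter {i}: episode_reward_mean={result['episode_reward_mean']}")
algo.stop()
ray.shutdown()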
The environment-related code is as follows:
def __init__(self, config: EnvContext):
    """
    Parameters
    ----------
    scenario : Scenario
        Scenario object, defining the properties of the environment
    fully_obs : bool, optional
        The observability mode of the environment; if True, uses fully
        observable mode, otherwise partially observable (default=False)
    flat_actions : bool, optional
        If True, uses a flat action space, otherwise uses a
        parameterised action space (default=True)
    flat_obs : bool, optional
        If True, uses a 1D observation space, otherwise uses a 2D
        observation space (default=True)
    render_mode : str, optional
        The render mode to use for the environment
    """
    scenario = config["scenario"]
    self.name = scenario.name
    self.scenario = scenario
    self.fully_obs = config["fully_obs"]
    self.flat_actions = config["flat_actions"]
    self.flat_obs = config["flat_obs"]
    self.render_mode = config["render_mode"]
    self.network = Network(scenario)
    self.current_state = State.generate_initial_state(self.network)
    self._renderer = None
    self.reset()

    if self.flat_actions:
        self.action_space = FlatActionSpace(self.scenario)
    else:
        self.action_space = ParameterisedActionSpace(self.scenario)

    if self.flat_obs:
        obs_shape = self.last_obs.shape_flat()
    else:
        obs_shape = self.last_obs.shape()

    obs_low, obs_high = Observation.get_space_bounds(self.scenario)
    self.observation_space = spaces.Box(
        low=obs_low, high=obs_high, shape=obs_shape
    )
    self.steps = 0
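For completeness, env_kwargs is a plain dict with the keys read above (the specific values below are placeholders), and the environment can also be constructed directly with it outside of RLlib. A rough standalone check under my assumptions (Scenario loading is omitted, and gymnasium-style reset/step return values are assumed):

# env_kwargs is passed to RLlib as env_config; since EnvContext behaves like a
# dict, a plain dict works for constructing the env directly (my assumption).
env_kwargs = {
    "scenario": scenario,      # Scenario object, loading omitted here
    "fully_obs": False,
    "flat_actions": True,
    "flat_obs": True,
    "render_mode": None,
}

env = NASimEnv(env_kwargs)
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print(obs.shape, reward, terminated, truncated)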
The relevant versions on my machine are as follows:
ray==2.9.0
ubuntu==18.04.1 x86_64
gymnasium==0.28.1
Thanks!