Hey community,
I have a problem with the `EnvCompatibility` wrapper in RLlib.
To truncate my episodes once they exceed a certain number of steps, I register my environment like this in my `__init__.py`:
```python
register(
    id="sar-v0.1",
    entry_point="sar.envs.sar_base:SarBase",
    max_episode_steps=300,
)
```
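For context, the truncation that `max_episode_steps` provides works roughly like the sketch below: a wrapper counts steps and forces `truncated=True` at the limit. This is a simplified pure-Python sketch, not the actual Gymnasium `TimeLimit` implementation, and the wrapped `env` and its five-tuple `step` return are assumptions:

```python
# Rough sketch of a TimeLimit-style wrapper driven by max_episode_steps.
# NOT the real gymnasium.wrappers.TimeLimit, just the idea behind it.
class TimeLimitSketch:
    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0
        return self.env.reset()

    def step(self, action):
        # The inner env already returns the new 5-tuple Gymnasium API.
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.elapsed += 1
        if self.elapsed >= self.max_episode_steps:
            truncated = True  # force truncation once the step limit is hit
        return obs, reward, terminated, truncated, info
```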
In my environment, the `step` function returns `self.obs, reward, done, False, info` (the new five-tuple Gymnasium API). This works perfectly for the truncation: when 300 steps are reached, the episode stops.
The problem is with the `train` function in RLlib, which apparently applies an `EnvCompatibility` wrapper and fails with this error:
```
2023-05-02 14:56:21,938 ERROR actor_manager.py:496 -- Ray error, taking actor 1 out of service. ray::RolloutWorker.apply() (pid=6410, ip=10.99.52.23, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f525f69c820>)
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 183, in apply
    raise e
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 174, in apply
    return func(self, *args, **kwargs)
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 86, in <lambda>
    lambda w: w.sample(), local_worker=False, healthy_only=True
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 914, in sample
    batches = [self.input_reader.next()]
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 92, in next
    batches = [self.get_data()]
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 277, in get_data
    item = next(self._env_runner)
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/evaluation/env_runner_v2.py", line 323, in run
    outputs = self.step()
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/evaluation/env_runner_v2.py", line 379, in step
    self._base_env.send_actions(actions_to_send)
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/env/vector_env.py", line 464, in send_actions
    ) = self.vector_env.vector_step(action_vector)
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/env/vector_env.py", line 360, in vector_step
    raise e
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/ray/rllib/env/vector_env.py", line 353, in vector_step
    results = self.envs[i].step(actions[i])
  File "/data/users/dahmadoun/conda-envs/gym_env/lib/python3.10/site-packages/gymnasium/wrappers/compatibility.py", line 107, in step
    obs, reward, done, info = self.env.step(action)
ValueError: too many values to unpack (expected 4)
```
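For what it's worth, the last two frames of the traceback show the root cause: the compatibility wrapper unpacks the old four-tuple `(obs, reward, done, info)`, while the environment already returns the new Gymnasium five-tuple. A minimal standalone reproduction of that mismatch (the function name here is made up purely for illustration):

```python
# An env step that already follows the new Gymnasium API: five return values.
def new_api_step():
    return "obs", 1.0, False, False, {}

# A compatibility layer written for the old API unpacks only four values,
# so the five-tuple raises exactly the ValueError seen in the traceback.
try:
    obs, reward, done, info = new_api_step()
except ValueError as e:
    print(e)  # too many values to unpack (expected 4)
```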
Do you have any idea how I can tell the Trainer that my `step` function already follows the new Gymnasium API?
Thank you!