EnvCompatibility problem with the number of values returned by step

I have a problem with EnvCompatibility in RLlib.
To truncate my episodes once they exceed a certain number of steps, I put this in my __init__.py:


In my environment, the step function returns self.obs, reward, done, False, info. This works perfectly for truncation: when 300 steps are reached, the episode stops.
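For readers unfamiliar with step-based truncation, here is a minimal sketch of the idea in plain Python. It is not Gymnasium's actual wrapper code: `CountingEnv`, `StepLimit`, and `run_episode` are hypothetical names made up for illustration, and the only assumption is the five-value step return (obs, reward, terminated, truncated, info) described above.

```python
class CountingEnv:
    """Hypothetical stand-in environment using the five-value step return."""

    def reset(self, seed=None, options=None):
        self.t = 0
        return self.t, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        # obs, reward, terminated, truncated, info
        return self.t, 0.0, False, False, {}


class StepLimit:
    """Simplified time limit: flags truncation after max_episode_steps steps."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self, **kwargs):
        self.elapsed = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.elapsed += 1
        if self.elapsed >= self.max_episode_steps:
            truncated = True  # the step budget is used up
        return obs, reward, terminated, truncated, info


def run_episode(env):
    """Step until the episode terminates or is truncated; return the step count."""
    env.reset()
    steps = 0
    while True:
        _, _, terminated, truncated, _ = env.step(None)
        steps += 1
        if terminated or truncated:
            return steps


episode_length = run_episode(StepLimit(CountingEnv(), max_episode_steps=300))
```

The fourth returned value is what lets the caller tell "ran out of steps" apart from a genuine terminal state.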

The problem is with the train function in RLlib, which apparently performs an “EnvCompatibility” check and raises this error:

Do you have any idea how I can tell the Trainer that my step function already follows the new Gymnasium API?

Here are more details on what I understood about the problem.

To add max_episode_steps, and thus truncation through the time limit wrapper, I had to add an element to the values that step returns. The RLlib trainer, however, seems to assume that the environment uses the old API and expects my step to return only 4 elements.
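The mismatch can be reproduced without any RL library. The sketch below is only an illustration, not RLlib or Gymnasium code, and not necessarily the exact error from the traceback: `new_api_step` and `old_api_consumer` are hypothetical names, and the consumer simply assumes the old four-value return.

```python
def new_api_step(action):
    """Hypothetical step following the new API: five values are returned."""
    # obs, reward, terminated, truncated, info
    return [0.0], 1.0, False, False, {}


def old_api_consumer(step_fn, action):
    """Hypothetical caller that assumes the old four-value return."""
    obs, reward, done, info = step_fn(action)  # fails when five values come back
    return obs, reward, done, info


try:
    old_api_consumer(new_api_step, 0)
    error_message = None
except ValueError as exc:
    # Unpacking five values into four names raises ValueError.
    error_message = str(exc)
```

Any layer that expects the old format will fail in a similar way when it sits on top of an environment that already returns five values.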

@Douae_Ahmadoun I will ping the RLlib guys here: @arturn

Hi! What version of RLlib are you using?
If you use Ray >= 2.3, RLlib should complain if and only if your env does not abide by the Gymnasium API.

Indeed, I use RLlib 2.3, and my environment should be compatible with the new Gymnasium API. Here it is kind of the opposite: my step returns 5 elements, as in the new API, but the error indicates that only 4 were expected, as in the old Gym.

I tried downgrading RLlib to version 2.2.0, but then I got other errors in the build function.

I found the cause of the error.
This is the function I was using to register my environment and use it in RLlib:

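def env_creator(env_config):
    env = SarBase()  # created but never used
    # Wrapping an environment that already follows the new API caused the error
    return EnvCompatibility(SarBase(env_config))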

Since the environment is already compatible with the new API, wrapping it with EnvCompatibility was useless, and even problematic: the wrapper expects the old API format (for example, 4 elements returned by step instead of 5).
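For anyone hitting the same error, a minimal sketch of the creator function without the wrapper could look like the following. The `SarBase` class here is only a hypothetical stand-in so that the snippet runs on its own (the real environment is not shown in this thread), and the registration name in the comment is made up.

```python
class SarBase:
    """Hypothetical stand-in for the real environment (five-value step return)."""

    def __init__(self, env_config=None):
        self.env_config = env_config or {}
        self.obs = 0

    def reset(self, seed=None, options=None):
        self.obs = 0
        return self.obs, {}  # (observation, info)

    def step(self, action):
        self.obs += 1
        done = False
        # obs, reward, terminated, truncated, info
        return self.obs, 0.0, done, False, {}


def env_creator(env_config):
    # Return the environment directly: it already follows the new API,
    # so no EnvCompatibility wrapper is needed.
    return SarBase(env_config)


# With Ray installed, registration would then look like this
# ("sar_base" is a made-up name):
# from ray.tune.registry import register_env
# register_env("sar_base", env_creator)

env = env_creator({})
env.reset()
step_result = env.step(0)
```

The only change from the original function is that the environment is returned as is, so its five-value step output reaches RLlib untouched.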

