Issue with custom environment

I have a reproduction in "Issues reproducing stable-baselines3 PPO performance with rllib".