Unable to get HalfCheetahWrapper to work with the MBMPO algorithm

I need to replicate the results of the original MBMPO paper. I see there is a HalfCheetahWrapper environment for MBMPO, but it seems to use HalfCheetah-v2, which is no longer supported. Which versions of ray, gym/gymnasium, tensorflow, mujoco and mujoco_py were used to produce the published results? I have written a wrapper of my own, but it does not produce a progress.csv file, and the training results (min_reward, max_reward, etc.) are all “nan”. Can someone please help?

Could you provide a reproduction script for what you’re seeing?

The latest ray release, ray==2.4.0, should be compatible with gymnasium==0.26.3, so you should be able to use HalfCheetah-v4. You’ll have to modify the HalfCheetahWrapper, though, since it currently uses HalfCheetah-v2.
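A rough sketch of what a v4-based wrapper might look like, in case it helps. This is not the wrapper shipped with RLlib. It assumes the wrapper's job is to expose a `reward(obs, action, obs_next)` method, that the default HalfCheetah-v4 observation layout puts the x-velocity at index 8, and that the control cost weight is 0.1. The names `half_cheetah_reward` and `HalfCheetahV4Wrapper` are made up for this example, and the import path is the one I would expect in gymnasium 0.26.x, so please check all of these against your installed versions.

```python
import numpy as np

try:
    # Assumed module path for gymnasium 0.26.x; needs the `mujoco` bindings.
    from gymnasium.envs.mujoco.half_cheetah_v4 import HalfCheetahEnv
except Exception:  # gymnasium and/or mujoco not installed
    HalfCheetahEnv = None


def half_cheetah_reward(obs, action, obs_next):
    """Reward computed from a transition; works for single or batched inputs.

    `obs` is unused but kept so the signature is (obs, action, obs_next).
    Assumptions: index 8 of the 17-dim observation is the x-velocity,
    and the control cost weight is 0.1.
    """
    obs_next = np.asarray(obs_next, dtype=np.float64)
    action = np.asarray(action, dtype=np.float64)
    forward_vel = obs_next[..., 8]
    ctrl_cost = 0.1 * np.sum(np.square(action), axis=-1)
    # Clip so a diverging learned dynamics model cannot produce huge rewards.
    return np.clip(forward_vel - ctrl_cost, -1000.0, 1000.0)


class HalfCheetahV4Wrapper(HalfCheetahEnv or object):
    """Hypothetical HalfCheetah-v4 env exposing a `reward` method."""

    def __init__(self, *args, **kwargs):
        if HalfCheetahEnv is None:
            raise ImportError("gymnasium with mujoco support is required")
        super().__init__(*args, **kwargs)

    def reward(self, obs, action, obs_next):
        return half_cheetah_reward(obs, action, obs_next)
```

The reward logic is kept in a plain function so it can be checked on hand-made arrays without MuJoCo installed, for example `half_cheetah_reward(np.zeros(17), np.zeros(6), np.ones(17))`.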

Note that RLlib only supports MBMPO for torch, not tensorflow; see the Algorithms page of the Ray 2.4.0 documentation.