I have to replicate the results of the original MBMPO paper. I see there is a HalfCheetahWrapper environment for MBMPO, but it seems to use v2, which is no longer supported. I want to know the correct versions of ray, gym/gymnasium, tensorflow, mujoco and mujoco_py that were used to produce the results. I have made a wrapper of my own, but it does not produce a progress.csv file, and the training results (min_reward, max_reward, etc.) are all “nan”. Can someone please help?
I am unable to get the HalfCheetahWrapper to work with the MBMPO algorithm
Could you provide a reproduction script for what you’re seeing?
The latest Ray release, ray==2.4.0, should be compatible with gymnasium==0.26.3, so you should be able to use HalfCheetah-v4; you’ll have to modify the HalfCheetahWrapper, since it currently uses HalfCheetah-v2.
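As a rough starting point for that modification, here is a minimal sketch of a v4-based wrapper. The names `half_cheetah_reward`, `make_half_cheetah_v4` and `HalfCheetahV4Wrapper` are made up for illustration and are not part of RLlib. It assumes MBMPO needs the environment to expose a batched `reward(obs, action, obs_next)` method, that index 8 of the default 17-dimensional observation is the forward velocity, and that the control cost weight is 0.1 (the HalfCheetah default). I have not checked that Ray 2.4.0 accepts this wrapper unchanged.

```python
import numpy as np


def half_cheetah_reward(obs, action, obs_next):
    """Approximate HalfCheetah reward computed from observations only.

    Assumption: with default settings, obs_next[..., 8] is the forward
    (x) velocity and the control cost weight is 0.1.
    Works for a single transition or a batch of transitions.
    """
    obs_next = np.asarray(obs_next, dtype=np.float64)
    action = np.asarray(action, dtype=np.float64)
    forward_vel = obs_next[..., 8]
    ctrl_cost = 0.1 * np.sum(np.square(action), axis=-1)
    # Clip to keep rewards predicted from model rollouts bounded.
    return np.clip(forward_vel - ctrl_cost, -1000.0, 1000.0)


def make_half_cheetah_v4():
    """Build HalfCheetah-v4 with an added reward() method.

    gymnasium is imported lazily so the reward function above can be
    tested without gymnasium or mujoco installed.
    """
    import gymnasium as gym

    class HalfCheetahV4Wrapper(gym.Wrapper):  # hypothetical name
        def reward(self, obs, action, obs_next):
            return half_cheetah_reward(obs, action, obs_next)

    return HalfCheetahV4Wrapper(gym.make("HalfCheetah-v4"))
```

Testing the reward function on a few hand-built transitions before starting a training run is a cheap sanity check. Note that HalfCheetah-v4 runs on the `mujoco` Python bindings rather than `mujoco_py`.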
Note that RLlib only supports MBMPO for torch, not tensorflow (see the Algorithms page of the Ray 2.4.0 documentation).
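When posting a reproduction script, it helps to include the versions you actually have installed. The sketch below uses only the standard library and compares the installed packages against the two pins mentioned above (ray 2.4.0, gymnasium 0.26.3). The `EXPECTED` table and the function names are illustrative; torch and mujoco are listed without a pin because no specific version was given here.

```python
from importlib import metadata

# Pins taken from the reply above; None means "any installed version".
EXPECTED = {
    "ray": "2.4.0",
    "gymnasium": "0.26.3",
    "torch": None,
    "mujoco": None,
}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package to a (found, wanted, ok) tuple."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        ok = found is not None and (wanted is None or found == wanted)
        report[package] = (found, wanted, ok)
    return report


if __name__ == "__main__":
    for package, (found, wanted, ok) in check_versions().items():
        status = "OK" if ok else "CHECK"
        print(f"{package}: installed={found} expected={wanted or 'any'} [{status}]")
```

Pasting this output alongside the script makes it easier for others to tell a version mismatch from a bug in the wrapper.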