RLlib w/ Unity3d ML Agents PPO is slow

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I have been developing a relatively simple environment in Unity3d using ML Agents. While ML Agents is pretty awesome, it does not support hyperparameter tuning well at all. In order to tune some hyperparameters (and in the future more easily plug in new algorithms) I decided to try out the RLlib ↔ Unity ML Agents integration.

I have noticed that when I run the 3DBall example with the default PPO configuration, the pure Unity version runs much faster than the RLlib version. Digging through the source code, it looks like the RLlib wrapper unity3d_env.py and the ML Agents learn.py script both use the UnityEnvironment from mlagents_envs.environment, so I'm curious as to what could be causing the slowdown.

In wall-clock time, it takes about 10-15 minutes to train 3DBall in Unity and an hour or more with RLlib. I tried changing the hyperparameters, especially those that control the batch sizes, but did not see any real improvement.
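For context, here is a rough sketch of how the batch-related PPO settings interact. The key names follow RLlib's classic PPO config dict, but the values are illustrative placeholders, not the ones from my runs, and `per_iteration_cost` is a hypothetical helper written only for this post. It ignores `num_envs_per_worker` and assumes a partial final minibatch still counts as one update, so treat the numbers as approximate.

```python
import math

# Illustrative values only (NOT the settings from my actual runs).
ppo_config = {
    "num_workers": 1,                # rollout workers, each driving one Unity instance
    "rollout_fragment_length": 200,  # env steps each worker collects per sampling round
    "train_batch_size": 4000,        # env steps gathered before each learning phase
    "sgd_minibatch_size": 128,       # samples per SGD update
    "num_sgd_iter": 30,              # passes over the train batch per iteration
}


def per_iteration_cost(cfg):
    """Estimate sampling rounds and SGD updates for one training iteration."""
    workers = max(cfg["num_workers"], 1)  # 0 workers means sampling on the driver
    steps_per_round = workers * cfg["rollout_fragment_length"]
    sampling_rounds = math.ceil(cfg["train_batch_size"] / steps_per_round)
    minibatches = math.ceil(cfg["train_batch_size"] / cfg["sgd_minibatch_size"])
    sgd_updates = cfg["num_sgd_iter"] * minibatches
    return {"sampling_rounds": sampling_rounds, "sgd_updates": sgd_updates}


print(per_iteration_cost(ppo_config))
```

With these placeholder numbers, a single worker needs 20 sampling rounds to fill one train batch, followed by roughly 960 SGD updates, so either phase could plausibly dominate the wall-clock time.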

Any suggestions as to what the issue might be?

Thanks,

Jesse