PPO with PyTorch backend slow on GPU for Ray 1.0

Hi, I was told that when using the PyTorch backend for PPO on a GPU, the training batch is copied to the GPU multiple times (once per minibatch update), which slows down learning. I just wanted to know whether the latest version of RLlib has fixed this issue.

In PPO, the minibatch passes over the training data happen here: `train_ops.py` in the ray-project/ray repo (commit 3e010c5760c99be5a9940001f33db087c52eb8e7) on GitHub.

Based on the code, it looks like the batch is already loaded onto the GPU.
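For anyone trying to follow the difference, here is a minimal, illustrative PyTorch sketch (not RLlib's actual implementation, and all names below are placeholders) of the efficient pattern being described: copy the full train batch to the GPU once, then run all SGD minibatch passes over that GPU-resident copy instead of re-copying per minibatch.

```python
# Illustrative sketch only: load the train batch onto the GPU once,
# then iterate minibatch SGD passes over the GPU-resident copy.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical train batch of observations and targets.
train_batch = {
    "obs": torch.randn(4000, 8),
    "targets": torch.randn(4000, 1),
}

model = torch.nn.Linear(8, 1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Copy the whole batch to the GPU once.
batch_on_device = {k: v.to(device) for k, v in train_batch.items()}

num_sgd_iter = 10        # number of passes over the train batch (as in PPO)
sgd_minibatch_size = 128

n = batch_on_device["obs"].shape[0]
for _ in range(num_sgd_iter):
    perm = torch.randperm(n, device=device)
    for start in range(0, n, sgd_minibatch_size):
        idx = perm[start:start + sgd_minibatch_size]
        # Minibatches are slices of the already-on-GPU batch;
        # no host-to-device copy happens inside this loop.
        obs = batch_on_device["obs"][idx]
        targets = batch_on_device["targets"][idx]
        loss = torch.nn.functional.mse_loss(model(obs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```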

Thank you, the latest version of RLlib does seem to have fixed that problem.


Hey @psxz, yes, we fixed that problem on the latest master. All torch algorithms now use the unified multi-GPU execution op (which was previously only available for TF).
This change will be included in the upcoming Ray 1.6 release.

We measured a speed increase of roughly 35% for PPO with one GPU on Atari as a result of this change.
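For reference, a minimal sketch of how one might run PPO with the torch backend on one GPU in RLlib around the 1.6 release; the env name and hyperparameters below are placeholders, not the exact setup used for the Atari measurement above.

```python
# Sketch: PPO with the PyTorch backend and one learner GPU in RLlib (~Ray 1.6).
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()

trainer = PPOTrainer(
    env="CartPole-v0",          # placeholder env; substitute an Atari env for that benchmark
    config={
        "framework": "torch",   # use the PyTorch backend
        "num_gpus": 1,          # learner GPU; minibatch SGD runs via the multi-GPU train op
        "train_batch_size": 4000,
        "sgd_minibatch_size": 128,
        "num_sgd_iter": 10,
    },
)

for _ in range(10):
    result = trainer.train()
    print(result["episode_reward_mean"])
```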

Thank you, that will be quite helpful.