PPO torch vs tf2


I would prefer to use tf2, because I am more familiar with it.
However, in my setup PPO runs well with torch, but when I run it with tf2, every so often the interaction between the worker and the environment stalls.

I am guessing this is when the learner performs its learning.
Is there any way to split the resources used for learning from those used by the workers (assuming that is the issue)?

At the moment I have 16 CPU threads and 1 GPU.
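
For reference, a minimal sketch of how such a split might be expressed on this hardware, assuming the dict-style RLlib config keys (`num_workers`, `num_cpus_per_worker`, `num_gpus_per_worker`, `num_cpus_for_driver`, `num_gpus`). The numbers are illustrative, and `cpu_budget` is a hypothetical helper that only checks the request fits in 16 threads. The idea is to reserve a few CPUs and the GPU for the process that does the learning, and keep the rollout workers on CPU only.

```python
# Sketch only: key names follow RLlib's dict-style config; check them
# against the Ray version you have installed. The numbers are illustrative.
TOTAL_CPUS = 16  # CPU threads available on the machine

config = {
    "framework": "tf2",
    "num_workers": 12,          # rollout workers that step the environment
    "num_cpus_per_worker": 1,   # CPUs reserved for each rollout worker
    "num_gpus_per_worker": 0,   # keep rollout workers off the GPU
    "num_cpus_for_driver": 4,   # CPUs reserved for the process that learns
    "num_gpus": 1,              # GPU reserved for the process that learns
}


def cpu_budget(cfg):
    """CPUs requested by the driver plus all rollout workers (hypothetical helper)."""
    return cfg["num_cpus_for_driver"] + cfg["num_workers"] * cfg["num_cpus_per_worker"]


# Fail early if the request would oversubscribe the machine.
assert cpu_budget(config) <= TOTAL_CPUS
```

With an `"env"` entry added, a dict like this would be passed to the trainer, for example through `tune.run("PPO", config=config)`. Whether this removes the stall is untested.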

I will give a more detailed update on this later, but so far I've observed:
PPO: torch is fine, tf shows the stalling behaviour I mentioned earlier.
A3C: torch is fine, tf crashes.

I have changed from a 1650 Super 4GB GPU to a 3090 24GB, but saw no improvement.

I'm starting to wonder whether it is a TensorFlow version issue (I'm on 2.6) or the way I'm configuring resources / learners / rollout workers.

@NDR008 I think you might be on to something regarding the TensorFlow version issue. We always run release tests on both torch and tf for PPO and they run fine.
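
To compare versions between setups, something like the following could be used. It is a small sketch that relies only on the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and the package names passed in are assumed to be the pip distribution names.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your environment.
    for name, version in installed_versions(["ray", "tensorflow", "torch"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the config makes it easier to check whether the TensorFlow version differs from the one used in the release tests.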
