Memory leak with PPO and GNN custom model

Hi, it’s as the title says.

My environment is pretty fast and light. But when it comes to training, the first call to trainer.train() takes 5 minutes and barely completes; when I call it a second time, the whole thing crashes.

I tested the model in a supervised training loop and it didn’t leak memory. It’s not the environment because I was generating data directly from it.
I tested the same code with a dummy model and it still crashed, so I’m convinced it’s something about RLlib’s training loop.

I wonder if there’s a way to debug memory usage per object. ray memory shows 0B in use, I assume because I’m in local mode, and the Ray dashboard is unusable.
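One generic option that does not depend on Ray’s tooling is Python’s standard-library tracemalloc module, which can report which source lines allocated the most memory between two points in time. The following is only a minimal sketch: `top_growth` and `leaky_step` are hypothetical names, `leaky_step` stands in for a call such as trainer.train(), and tracemalloc only sees Python-level allocations, so it would likely miss native or CUDA memory.

```python
import tracemalloc


def top_growth(step, n_calls=3, limit=5):
    """Call `step` n_calls times and return the allocation sites that changed most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(n_calls):
        step()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() returns StatisticDiff entries, largest change first.
    return after.compare_to(before, "lineno")[:limit]


# Hypothetical stand-in for trainer.train(): keeps ~1 MB alive per call.
_leak = []


def leaky_step():
    _leak.append(bytearray(1_000_000))


for stat in top_growth(leaky_step):
    print(stat)
```

If the same file and line keep growing across repeated calls, that is a reasonable place to start looking.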

Here is the code, with no local dependencies; it should reproduce the problem.

Here’s the error trace when it finally crashes:

Finally, RLlib doesn’t seem to detect my GPU, even though cuda.is_available() == True.

Hi @TheExGenesis,

I solved my problem. I was using the default hyperparameters, and they were too large for my setup. It worked fine with these:

"rollout_fragment_length": 10,
"train_batch_size": 100,
"sgd_minibatch_size": 10,
"num_sgd_iter": 3,
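To get a rough feel for how much smaller these settings make each train() call, the arithmetic can be sketched as below. `ppo_iteration_cost` is a hypothetical helper, not an RLlib API. The values in `assumed_defaults` are what I recall the PPO defaults being in RLlib versions of that era, so treat them as an assumption and check your own version. The count is approximate, since it ignores how RLlib handles a last, partial minibatch.

```python
def ppo_iteration_cost(cfg):
    """Rough per-train() workload implied by the batch-related PPO settings."""
    minibatches_per_epoch = cfg["train_batch_size"] // cfg["sgd_minibatch_size"]
    return {
        # Environment steps collected before each round of SGD.
        "samples_per_train_call": cfg["train_batch_size"],
        # Rollout fragments needed to fill one train batch.
        "fragments_per_batch": cfg["train_batch_size"] // cfg["rollout_fragment_length"],
        # Minibatch updates: minibatches per epoch times number of epochs.
        "sgd_updates": minibatches_per_epoch * cfg["num_sgd_iter"],
    }


# Settings from the post above.
small = {
    "rollout_fragment_length": 10,
    "train_batch_size": 100,
    "sgd_minibatch_size": 10,
    "num_sgd_iter": 3,
}

# ASSUMPTION: older RLlib PPO defaults, recalled from memory; verify for your version.
assumed_defaults = {
    "rollout_fragment_length": 200,
    "train_batch_size": 4000,
    "sgd_minibatch_size": 128,
    "num_sgd_iter": 30,
}

print(ppo_iteration_cost(small))
print(ppo_iteration_cost(assumed_defaults))
```

Under those assumed defaults, each train() call holds a 40x larger sample batch and performs roughly 30x as many minibatch updates. That would fit with the defaults being too heavy, though it does not by itself explain a leak.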