RLLib PPO with centralized critic and LSTM (torch)

rantav · November 30, 2021, 7:16pm

I have seen the example of how to create PPO with a centralized critic and it’s been useful, thanks!

I’m trying to adapt that example, using Torch, so that the model is an LSTM or GRU instead of a plain feed-forward network, but so far I’ve been unsuccessful.
Is there an example of, or any hint on, how to use an LSTM as a centralized critic (preferably with PPO)?

Specifically, I’m having trouble understanding how to set seq_lens properly, but maybe I’ve just gone down the wrong rabbit hole and there is an easier way to do this.
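To make the seq_lens part of the question concrete, here is a minimal, dependency-free sketch of the padding convention as I understand it. It is plain Python, not RLlib code: `split_padded_batch` is a hypothetical helper, and it assumes that a recurrent model receives a flat batch of B*T zero-padded steps together with seq_lens holding the B true sequence lengths. In Torch the same idea would be a reshape to [B, T, ...] plus a mask over the padded steps.

```python
from typing import List, Sequence, Tuple


def split_padded_batch(
    flat: Sequence[float], seq_lens: Sequence[int]
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Reshape a flat, zero-padded batch of B*T steps into B rows of T steps.

    Hypothetical helper for illustration only (not part of RLlib).
    Also returns a mask that is True for real steps and False for padding.
    """
    num_seqs = len(seq_lens)  # B
    if num_seqs == 0 or len(flat) % num_seqs != 0:
        raise ValueError("len(flat) must be a positive multiple of len(seq_lens)")
    max_len = len(flat) // num_seqs  # T, the padded length of every sequence
    if any(n > max_len for n in seq_lens):
        raise ValueError("a sequence length exceeds the padded length")
    # Row i holds the T consecutive steps belonging to sequence i.
    rows = [list(flat[i * max_len:(i + 1) * max_len]) for i in range(num_seqs)]
    # Step t of sequence i is real only if t is below that sequence's length.
    mask = [[t < n for t in range(max_len)] for n in seq_lens]
    return rows, mask


# Two sequences padded to T=3: the first has 3 real steps, the second has 2.
flat_batch = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]
rows, mask = split_padded_batch(flat_batch, seq_lens=[3, 2])
```

If that assumption holds, a centralized critic with its own LSTM would need its extra inputs (for example the other agents' observations) padded and chunked in the same way, so that they line up step for step with the policy's sequences.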

Using RLLib 1.6

Thanks!
