RLlib PPO with centralized critic and LSTM (torch)

I have seen the example of how to create PPO with a centralized critic and it’s been useful, thanks!

I’m trying to adapt this codebase, using Torch, so that the critic uses an LSTM or GRU instead of just a feed-forward network, and so far I’ve been unsuccessful.
Is there an example of, or any hints on, how to use an LSTM as a centralized critic? (preferably with PPO)

Specifically, I’m having trouble understanding how to properly set seq_lens, but maybe I’ve just gone down the wrong rabbit hole and there are easier ways to do this.
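
To make the seq_lens question concrete, here is a minimal, library-free sketch of the convention as I understand it. It assumes the batch reaches the model flat and zero-padded, with B sequences of T steps each, and that seq_lens holds one true length per sequence. That layout is an assumption, not a confirmed description of RLlib 1.6 internals, and the helper names `split_into_sequences` and `masked_mean` are hypothetical, not RLlib APIs.

```python
from typing import List, Sequence, Tuple


def split_into_sequences(
    flat: Sequence[float], seq_lens: Sequence[int]
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Reshape a flat, zero-padded batch of B*T steps into B rows of T steps.

    Hypothetical helper: it assumes every sequence was padded to the same
    length T, so T can be recovered as len(flat) // len(seq_lens).
    """
    num_seqs = len(seq_lens)
    if num_seqs == 0 or len(flat) % num_seqs != 0:
        raise ValueError("flat batch size must be a multiple of len(seq_lens)")
    max_seq_len = len(flat) // num_seqs
    if any(n < 0 or n > max_seq_len for n in seq_lens):
        raise ValueError("each entry of seq_lens must lie in [0, max_seq_len]")

    # Row b holds time steps b*T .. (b+1)*T - 1 of the flat batch.
    padded = [
        list(flat[b * max_seq_len:(b + 1) * max_seq_len])
        for b in range(num_seqs)
    ]
    # True marks a real step, False marks padding to be ignored in the loss.
    mask = [
        [t < seq_lens[b] for t in range(max_seq_len)]
        for b in range(num_seqs)
    ]
    return padded, mask


def masked_mean(values: List[List[float]], mask: List[List[bool]]) -> float:
    """Average only over real (unpadded) steps, e.g. for a value loss."""
    kept = [
        v
        for row_v, row_m in zip(values, mask)
        for v, m in zip(row_v, row_m)
        if m
    ]
    return sum(kept) / len(kept)


# Two sequences padded to T = 4; the second one has only 2 real steps.
flat_batch = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 0.0, 0.0]
seq_lens = [4, 2]
padded, mask = split_into_sequences(flat_batch, seq_lens)
```

If this reading is right, the Torch version would be a reshape of the flat tensor to [B, T, features] before the LSTM call, with the same mask applied to the critic loss so padded steps do not contribute.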

I’m using RLlib 1.6.

Thanks!