I will probably change the DQN policy & Trainer to support LSTM (or at least RNN).
Do you know of any major problems that would prevent that?
I am asking because DQN doesn't currently support the RNN and LSTM options in RLlib. I would like to be sure that there is no serious known issue preventing that.
Hey @Maxime_Riche, great question!
We do have an R2D2 agent (since a few weeks ago) in the current master (and upcoming 1.3 release).
It's basically a vanilla DQN that runs with LSTMs/RNNs by storing sequences (of length max_seq_len) in the buffer, then sampling these sequences as-is from the buffer and running them through the LSTM for learning updates. The paper we used is linked here: https://openreview.net/pdf?id=r1lyTjAqYX
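To make the "store sequences, sample them as-is" idea concrete, here is a minimal plain-Python sketch. It is not RLlib's actual buffer: the class `SequenceReplayBuffer`, its methods, and the `(obs, action, reward, next_obs)` tuple layout are all hypothetical names made up for illustration. Only the `max_seq_len` idea comes from the description above.

```python
import random
from collections import deque


class SequenceReplayBuffer:
    """Hypothetical buffer (not RLlib's) that stores chunks of consecutive steps."""

    def __init__(self, capacity, max_seq_len):
        self.max_seq_len = max_seq_len
        # Each entry is one sequence; the oldest sequences are evicted first.
        self.sequences = deque(maxlen=capacity)

    def add_episode(self, transitions):
        # Cut the episode into consecutive chunks of at most max_seq_len steps.
        for start in range(0, len(transitions), self.max_seq_len):
            self.sequences.append(transitions[start:start + self.max_seq_len])

    def sample(self, batch_size):
        # Return whole sequences with their time order intact, so a recurrent
        # model could be unrolled over each one during the learning update.
        return random.sample(list(self.sequences), batch_size)


# Usage: a 10-step episode with max_seq_len=4 becomes chunks of 4, 4 and 2 steps.
buffer = SequenceReplayBuffer(capacity=100, max_seq_len=4)
episode = [(t, 0, 1.0, t + 1) for t in range(10)]  # (obs, action, reward, next_obs)
buffer.add_episode(episode)
batch = buffer.sample(batch_size=2)
```

The point of the design is that the unit of storage and sampling is a sequence rather than a single transition, so the time order inside each sample is preserved for the LSTM. Padding of short chunks and the handling of the initial recurrent state are left out of this sketch.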