I would like some guidance on adding LSTM layers to a custom DQN. I currently have a working system using DQN (with the Rainbow configuration), and my environment provides observations of shape `(time, features)`, so the data already has the form `(batch, time, feature)`. However, without an RNN/LSTM I am not treating the time component well.
The existing threads on this are somewhat outdated (see "Are there some blocking points in adding LSTM to DQN?" and "R2D2 algorithm").
- Is it still the case that DQN does not support this easily?
- Are there any current examples of this, since the R2D2 one no longer seems to be available?
- My environment already returns a sequence of observations to capture the temporal component (`t_0, t_-1, ... t_-n`). Am I correct in thinking I can get away with adding some LSTM layers to the model without having to handle `sequence_length` in the replay buffer?
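
For concreteness, the idea in the last bullet could be sketched roughly as follows. This is plain PyTorch, not RLlib's model API, and the class name `WindowedLSTMQNet`, the `hidden_size` default, and the assumption that the window is ordered oldest to newest are all hypothetical. The LSTM starts from a zero state for every window, so no recurrent state is carried between environment steps and each replay sample stays a single self-contained `(time, features)` observation.

```python
import torch
import torch.nn as nn


class WindowedLSTMQNet(nn.Module):
    """Hypothetical Q-network that runs an LSTM over a fixed observation window."""

    def __init__(self, num_features: int, num_actions: int, hidden_size: int = 64):
        super().__init__()
        # batch_first=True matches the (batch, time, feature) layout of the observations.
        self.lstm = nn.LSTM(num_features, hidden_size, batch_first=True)
        self.q_head = nn.Linear(hidden_size, num_actions)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # No initial state is passed, so the LSTM starts from zeros for each window.
        out, _ = self.lstm(obs)  # (batch, time, hidden_size)
        # Assumes the last time index holds the most recent observation (t_0).
        return self.q_head(out[:, -1])  # (batch, num_actions)


q_net = WindowedLSTMQNet(num_features=5, num_actions=3)
q_values = q_net(torch.randn(4, 8, 5))  # batch of 4 windows, 8 steps, 5 features
print(q_values.shape)  # torch.Size([4, 3])
```

Whether this is enough depends on whether the window of n steps holds all the history the agent needs; anything older than `t_-n` is invisible to a model built this way.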
Cheers,