Max_seq_len of LSTM and Attention Net

Hello everyone!
I want to train a CNN+LSTM and a CNN+AttentionNet model separately. My questions are the following:

1. How can I set the same number of past observations passed to an LSTM and to an Attention Net?

My best guess:

---------------------------------------------------------------------
Set LSTM config    -> max_seq_len=64
---------------------------------------------------------------------
Set Att.Net config -> attention_memory_training=64
                      attention_memory_inference=64
---------------------------------------------------------------------
Should I set max_seq_len=64 for the AttentionNet too? What effect does max_seq_len have on the Attention Net? (See the config sketch below for what I mean.)
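
In RLlib model-config terms, here is roughly what I mean (just a sketch with my assumed value of 64; the keys are the standard model config options):

```python
# Sketch only -- assumed values, standard RLlib "model" config keys.
lstm_model_config = {
    "use_lstm": True,
    "max_seq_len": 64,  # length of the sampled segments fed to the LSTM
}

attention_model_config = {
    "use_attention": True,
    "attention_memory_training": 64,   # memory length used during training
    "attention_memory_inference": 64,  # memory length used during inference
    # "max_seq_len": 64,  # <-- unclear to me whether this is also needed here
}
```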

2. How many previous actions/rewards are passed to the LSTM if I set ‘lstm_use_prev_action=True’ and ‘lstm_use_prev_reward=True’ in the LSTM config? How can I match these on the Attention Net?

My best guess:

---------------------------------------------------------------------
On the LSTM this will pass the 64 past rewards/actions.
---------------------------------------------------------------------
On the Att.Net I have to set -> attention_use_n_prev_actions=64
                                attention_use_n_prev_rewards=64
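
As a sketch (same assumptions as above), this is the matching I have in mind:

```python
# Sketch only -- assumed values.
lstm_model_config = {
    "use_lstm": True,
    "max_seq_len": 64,
    "lstm_use_prev_action": True,   # does this feed the 64 past actions?
    "lstm_use_prev_reward": True,   # does this feed the 64 past rewards?
}

attention_model_config = {
    "use_attention": True,
    "attention_use_n_prev_actions": 64,  # my attempt to match the LSTM
    "attention_use_n_prev_rewards": 64,
}
```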

Thank you in advance!

Hi @TothAron ,

  1. max_seq_len=64 is probably fine. max_seq_len sets the length of the segments we gather from the sampling procedure, so it is the maximum coherent context you get during training, which naturally affects how the attention net trains.
  2. For the LSTM, you only use information that was produced in the previous step (the state) and combine it with new information to produce a new state and output. So the corresponding value would always be something like “lstm_use_n_prev_actions=1”; we don’t provide such an option, though. (See the config sketch after this list for how the two setups compare.)
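
Putting both points together, a minimal sketch of how I would set the two configs (assuming the standard RLlib model config keys and your value of 64; the env is just a placeholder):

```python
# LSTM variant: only the single previous action/reward is concatenated to the
# observation at each step; the recurrent state carries the older history.
lstm_config = {
    "env": "CartPole-v1",  # placeholder env
    "model": {
        "use_lstm": True,
        "max_seq_len": 64,             # length of the training segments
        "lstm_use_prev_action": True,  # one previous action, not 64
        "lstm_use_prev_reward": True,  # one previous reward, not 64
    },
}

# Attention (GTrXL) variant: memory length and the number of previous
# actions/rewards are explicit knobs, so matching 64 looks like this.
attention_config = {
    "env": "CartPole-v1",  # placeholder env
    "model": {
        "use_attention": True,
        "max_seq_len": 64,                  # training segment length, as above
        "attention_memory_training": 64,
        "attention_memory_inference": 64,
        "attention_use_n_prev_actions": 64,
        "attention_use_n_prev_rewards": 64,
    },
}
```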