Hi Lukas, hi smorad, I have a slightly different understanding of the two parameters.
rollout_fragment_length describes how many environment steps a rollout worker has to take before its experiences can be collected.
- Let's say a rollout worker has zero experiences collected - it is just starting to act. If it completes an episode of m > n steps, where n is the rollout_fragment_length, it will chop this episode into pieces of length n, which are then collected by the training algorithm. The worker keeps the leftover experiences, adds its future experiences to them, and repeats the process.
- In another scenario, the rollout worker also starts with zero experiences, but its first episode is shorter than the rollout_fragment_length. It must therefore collect more experiences before it can send a single fragment, which will then consist of experiences from multiple episodes.
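The two scenarios above can be sketched roughly like this. This is a hypothetical toy, not RLlib's actual implementation: the `fragments` helper, the episode lists, and the (episode_id, step) tuples are all made up for illustration.

```python
# Toy sketch (NOT RLlib's code) of a worker emitting fixed-size
# fragments that may span episode boundaries.
from typing import Iterator, List, Tuple

def fragments(
    episodes: List[List[int]], fragment_length: int
) -> Iterator[List[Tuple[int, int]]]:
    """Yield fragments of exactly `fragment_length` (episode_id, step) pairs.

    Leftover steps are carried over in the buffer and combined with
    later episodes, mirroring the behavior described above.
    """
    buffer: List[Tuple[int, int]] = []
    for ep_id, episode in enumerate(episodes):
        for step in episode:
            buffer.append((ep_id, step))
            if len(buffer) == fragment_length:
                yield buffer
                buffer = []

# Episode 0 has 5 steps, episode 1 has 3 steps; fragment_length = 4.
frags = list(fragments([[0, 1, 2, 3, 4], [0, 1, 2]], 4))
# The first fragment comes entirely from episode 0; the second mixes
# the leftover step of episode 0 with all of episode 1.
```

Note how the episode id travels with every step: that is the bookkeeping that lets the collector know where one episode ends and the next begins inside a fragment.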
RLlib keeps track of which experiences belong to which episode for you, so that later on we can avoid feeding recurrent models experiences from different episodes if we do not want that.
max_seq_len is predominantly used in rnn_sequencing.py (in the ray-project/ray repo on GitHub) to chop trajectories into sequences of at most max_seq_len timesteps. You need this especially for recurrent models. As far as I know, it can be ignored if your policy is stateless.
(1) If max_seq_len is larger than the largest fragment you have collected, RLlib will use the largest sequence length it can find in your batch - possibly rollout_fragment_length itself, if your episodes are long enough. Or, if you are "lucky", you happen to have a sequence that stems from two rollout fragments and can use your maximum sequence length.
(2) RLlib looks for the largest sequence in the batch and chops the batch into pieces of that size.
(3) Again, if your episodes are long enough, your sequences will be of size rollout_fragment_length. Otherwise they will be smaller.
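The chopping step in (2)/(3) can be illustrated with a tiny sketch. Again, this is a made-up helper (`chop`) under the assumption stated above: runs shorter than the limit stay whole, longer runs get split.

```python
from typing import List

def chop(seq_lens: List[int], max_seq_len: int) -> List[int]:
    """Split each contiguous run length into pieces of at most max_seq_len.

    A run shorter than max_seq_len is kept as one (shorter) sequence;
    a longer run is cut into max_seq_len-sized pieces plus a remainder.
    """
    out: List[int] = []
    for length in seq_lens:
        while length > 0:
            piece = min(length, max_seq_len)
            out.append(piece)
            length -= piece
    return out

# Three runs of 7, 3, and 5 steps, chopped with max_seq_len = 4:
chop([7, 3, 5], 4)  # -> [4, 3, 3, 4, 1]
```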
The trajectories are cut into pieces, and the recurrent model (or attention model?) receives sequences of at most max_seq_len timesteps. But as far as I understand it, nothing goes to waste: smaller chunks are simply padded where needed.
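To make the padding concrete, here is a minimal sketch (hypothetical `pad_sequences` helper, not RLlib's API) of right-padding variable-length chunks to a common length while keeping the true lengths around so the model can mask the padding:

```python
import numpy as np

def pad_sequences(seqs, max_seq_len):
    """Right-pad each sequence with zeros to max_seq_len.

    Returns the padded batch plus the true lengths, which a recurrent
    model needs in order to ignore the padded timesteps.
    """
    batch = np.zeros((len(seqs), max_seq_len), dtype=np.float32)
    lengths = np.array([len(s) for s in seqs], dtype=np.int64)
    for i, s in enumerate(seqs):
        batch[i, : len(s)] = s
    return batch, lengths

batch, lengths = pad_sequences([[1, 2, 3], [4]], max_seq_len=3)
# batch   -> [[1., 2., 3.], [4., 0., 0.]]
# lengths -> [3, 1]
```

The padded zeros are what I would expect to cost memory (and a bit of compute), which is why the question below about memory and model performance comes up.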
@smorad So as far as I understand it, this affects memory usage. But it might also affect model performance, because of the padding?