When working with R2D2 (DQN + LSTM), if you forget to change the trainer from DQNTrainer to R2D2Trainer, you get an infinite loop inside SampleBatch._get_slice_indices.
Debugging that is a bit painful.
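In my debugging setup, I had slice_size = 1 and self.seq_lens containing 2s. Every sequence is then longer than the slice, so idx is decremented again on each pass, start_pos is reset to where it was, and the loop never advances.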
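To see the stall without RLlib, here is a rough standalone replay of the seq_lens branch with those values. It is only a sketch: trace_loop and its max_steps cap are names made up for illustration, and the cap exists purely so the example terminates.

```python
def trace_loop(seq_lens, slice_size, max_steps=5):
    """Replay the seq_lens branch and record idx at the start of each pass.

    max_steps is a hypothetical safety cap (not part of RLlib) so that
    this sketch always terminates.
    """
    visited = []
    start_pos = 0
    current_size = 0
    idx = 0
    steps = 0
    while idx < len(seq_lens) and steps < max_steps:
        visited.append(idx)
        seq_len = seq_lens[idx]
        current_size += seq_len
        if current_size >= slice_size:
            start_pos += slice_size
            if current_size > slice_size:
                overhead = current_size - slice_size
                start_pos -= seq_len - overhead
                idx -= 1  # step back to revisit the same sequence
            current_size = 0
        idx += 1
        steps += 1
    return visited, start_pos


visited, start_pos = trace_loop([2, 2], slice_size=1)
print(visited, start_pos)  # [0, 0, 0, 0, 0] 0
```

idx stays at 0 and start_pos is back at 0 after every pass, so without the cap the loop would keep appending the same slice forever.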
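Maybe an assert like this inside the loop would work (a sequence of exactly slice_size still terminates, only longer ones get stuck):

    assert self.seq_lens[idx] <= slice_size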

    class SampleBatch(dict):
        [...]
        def _get_slice_indices(self, slice_size):
            i = 0
            slices = []
            if self.seq_lens is not None and len(self.seq_lens) > 0:
                start_pos = 0
                current_slize_size = 0
                idx = 0
                while idx < len(self.seq_lens):
                    seq_len = self.seq_lens[idx]
                    current_slize_size += seq_len
                    # Complete minibatch -> Append to slices.
                    if current_slize_size >= slice_size:
                        slices.append((start_pos, start_pos + slice_size))
                        start_pos += slice_size
                        if current_slize_size > slice_size:
                            overhead = current_slize_size - slice_size
                            start_pos -= seq_len - overhead
                            idx -= 1
                        current_slize_size = 0
                    idx += 1
            else:
                while i < self.count:
                    slices.append((i, i + slice_size))
                    i += slice_size
            return slices
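
One possible shape for such a guard, sketched as a standalone function rather than a patch to SampleBatch: get_slice_indices is a hypothetical name, it only copies the seq_lens branch above, and it checks all sequence lengths up front. It raises ValueError instead of using assert, on the assumption that the check should survive running Python with -O (which strips asserts).

```python
def get_slice_indices(seq_lens, slice_size):
    """Standalone copy of the seq_lens branch with an up-front length check.

    Hypothetical helper for illustration only, not RLlib API.
    """
    # A sequence longer than slice_size can never be consumed by the loop
    # below, so fail fast instead of spinning forever.
    too_long = [s for s in seq_lens if s > slice_size]
    if too_long:
        raise ValueError(
            f"seq_lens contains sequences longer than slice_size={slice_size}: "
            f"{too_long[:5]}"
        )

    slices = []
    start_pos = 0
    current_size = 0
    idx = 0
    while idx < len(seq_lens):
        seq_len = seq_lens[idx]
        current_size += seq_len
        # Complete minibatch -> append to slices.
        if current_size >= slice_size:
            slices.append((start_pos, start_pos + slice_size))
            start_pos += slice_size
            if current_size > slice_size:
                overhead = current_size - slice_size
                start_pos -= seq_len - overhead
                idx -= 1
            current_size = 0
        idx += 1
    return slices


print(get_slice_indices([2, 2], 2))  # [(0, 2), (2, 4)]

try:
    get_slice_indices([2, 2], 1)
except ValueError as err:
    print("rejected:", err)
```

With the values from my debugging setup (slice_size = 1, sequences of length 2), this fails immediately with an error message instead of hanging.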