Hi all, I want to concatenate all the sample batches in a replay buffer and use the experience to compute an update to a forward dynamics model. At the moment, I have an issue with ConcatBatches, where I would expect the combine operator to give me a single SampleBatch of the same size as the replay buffer. However, the size of the batches differs from the number of stored samples.
I also tried setting min_batch_size to the size of the replay buffer (local_replay_buffer.num_added). Surprisingly, this still gives me SampleBatches of size 32, even at ever-increasing replay buffer sizes.
Could anyone explain the behaviour of
dynamics_op = (
Hi @sn73jq ,
Replay() yields a batch of size replay_batch_size. Your local_replay_buffer will be a MultiAgentReplayBuffer and, by default, replay_batch_size will be equal to train_batch_size. Which is why you always sample batches of the same size from it!
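To illustrate the mechanism, here is a pure-Python sketch (not RLlib's actual implementation; the class and method names are invented for illustration): the sample size is fixed by replay_batch_size, regardless of how many samples have been added.

```python
import random

class ToyReplayBuffer:
    """Illustrative stand-in for a replay buffer with a fixed replay batch size."""

    def __init__(self, replay_batch_size):
        self.replay_batch_size = replay_batch_size
        self.storage = []
        self.num_added = 0

    def add(self, sample):
        self.storage.append(sample)
        self.num_added += 1

    def replay(self):
        # Always samples exactly replay_batch_size items (with replacement),
        # no matter how large the buffer has grown.
        return random.choices(self.storage, k=self.replay_batch_size)

buffer = ToyReplayBuffer(replay_batch_size=32)
for i in range(1000):
    buffer.add(i)

print(len(buffer.replay()))  # 32, even though num_added == 1000
```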
That being said, the min_batch_size argument of ConcatBatches should influence the size of the batches, as the name suggests. But I suspect that the code you are writing lives inside an execution plan. The execution plan method is only executed once, at the beginning of your training. Therefore, when it is executed, the replay buffer is empty and your min_batch_size becomes zero. This does not change later, because later on it is not the execution plan method itself that is run, but the iterator it returns that is iterated over. This is not intuitive and will change in the near future. What it means is that the initial min_batch_size argument (which is 0) for ConcatBatches stays valid throughout your experiment.
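A minimal sketch of that pitfall (plain Python with invented names; RLlib's real execution plan is more involved): the plan function runs once while the buffer is still empty, so the value it reads is frozen into the iterator it returns.

```python
class Buffer:
    def __init__(self):
        self.samples = []

    @property
    def num_added(self):
        return len(self.samples)

def execution_plan(buffer):
    # Runs exactly once, at setup time, while the buffer is still empty.
    min_batch_size = buffer.num_added  # -> 0 here, and it stays 0

    def iterate():
        while True:
            # The generator body runs later, but min_batch_size was
            # captured at plan-construction time and is never re-read.
            yield min_batch_size

    return iterate()

buffer = Buffer()
plan = execution_plan(buffer)       # buffer empty: min_batch_size frozen at 0
buffer.samples.extend(range(100))   # filling the buffer later changes nothing
print(next(plan))  # 0, not 100
```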
If you really need batches comprising the whole buffer, for now I think the easiest solution would be to modify the replay buffer code yourself and ignore the replay_batch_size in the replay buffer, always simply concatenating all stored batches.
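As a sketch of that workaround (again plain Python with invented names; in RLlib you would patch the buffer's actual replay method): a replay variant that ignores the batch size and concatenates everything stored.

```python
class PatchedReplayBuffer:
    """Toy buffer whose replay variant returns the whole contents at once."""

    def __init__(self):
        self.batches = []  # each entry stands in for one SampleBatch

    def add_batch(self, batch):
        self.batches.append(batch)

    def replay_all(self):
        # Ignore any replay_batch_size and concatenate every stored batch.
        combined = []
        for batch in self.batches:
            combined.extend(batch)
        return combined

buffer = PatchedReplayBuffer()
buffer.add_batch([1, 2, 3])
buffer.add_batch([4, 5])
print(buffer.replay_all())  # [1, 2, 3, 4, 5]
```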
We are working on making things with replay buffers simpler.
I hope this helps!