I am looking for a way to save and load the replay buffer for off-policy methods. I am looking into this as a transfer-learning-like approach. Please let me know.
There is currently no support for this. I am sure they would welcome a PR if you implement it; then we could add it to the save_state methods. Currently there are often performance drops when resuming training with off-policy algorithms because, even though the policy weights are restored, the replay buffer restarts empty.
Hey @axr8716, yeah, what @mannyv said :). It's not supported right now, but it's on our TODO list.
The problem is that the replay buffer objects currently only “sit” inside the Trainer’s execution plan function, so we have no reference to them from within the Trainer or the Policies. We need to make these objects registrable (via some convention) so that they become accessible when saving/restoring a Trainer. Feel free to do a PR that fixes this; a simple POC that only works for e.g. SimpleQ would suffice to get this rolling.
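To make the idea concrete, here is a minimal sketch (not RLlib code; the `ReplayBuffer`, `Trainer`, `get_state`/`set_state`, and `save_checkpoint`/`restore_checkpoint` names are all hypothetical) of what “registering” the buffer on the Trainer could look like, so that its contents get checkpointed and restored alongside the policy weights instead of being lost inside the execution plan:

```python
import pickle


class ReplayBuffer:
    """Minimal FIFO buffer, used only to illustrate the idea."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.storage = []

    def add(self, transition):
        # Drop the oldest transition once capacity is reached.
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)
        self.storage.append(transition)

    def get_state(self):
        # Everything needed to reconstruct the buffer exactly.
        return {"capacity": self.capacity, "storage": self.storage}

    def set_state(self, state):
        self.capacity = state["capacity"]
        self.storage = state["storage"]


class Trainer:
    """Stand-in for a Trainer that keeps a reference to its buffer."""

    def __init__(self):
        # In RLlib today this object lives only inside the execution plan;
        # the point of the sketch is that the Trainer holds a reference to it.
        self.replay_buffer = ReplayBuffer(capacity=50_000)
        self.policy_weights = {}

    def save_checkpoint(self, path: str):
        state = {
            "weights": self.policy_weights,
            "replay_buffer": self.replay_buffer.get_state(),
        }
        with open(path, "wb") as f:
            pickle.dump(state, f)

    def restore_checkpoint(self, path: str):
        with open(path, "rb") as f:
            state = pickle.load(f)
        self.policy_weights = state["weights"]
        # Restoring the buffer avoids the "restarts empty" performance drop.
        self.replay_buffer.set_state(state["replay_buffer"])
```

The only real design decision here is that the buffer exposes `get_state`/`set_state` and is reachable from the Trainer, so whatever convention is used for registration, the existing save/restore path can simply include the buffer's state in the checkpoint dict.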