The original DQN algorithm samples experiences randomly from the replay buffer. Is there any way to sample experiences in batches that preserve sequentiality, for DQN or PPO?
Yes, for DQN. PPO does not use a replay buffer.
Check out the current master branch and have a look at the replay_buffer_config attribute of the DQNConfig. You can set the storage_unit there to "sequences".
This obviously has implications for the rest of the algorithm, so you will want to choose your sequencing parameters (sequence length, burn-in, etc.) accordingly.
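Roughly something like this (an untested sketch against a recent master / Ray 2.x install; the environment, buffer type, and the specific sequencing values are just placeholders, and exact key names may vary between versions):

```python
from ray.rllib.algorithms.dqn import DQNConfig

config = (
    DQNConfig()
    .environment("CartPole-v1")  # example env, not from this thread
    .training(
        replay_buffer_config={
            "type": "MultiAgentReplayBuffer",
            # Store and replay whole sequences instead of single timesteps:
            "storage_unit": "sequences",
            # Sequencing parameters -- illustrative values only:
            "replay_sequence_length": 4,
            "replay_burn_in": 0,
            "replay_zero_init_states": True,
        }
    )
)

algo = config.build()
results = algo.train()
```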
Master also features docs on replay buffers.
Best