Adding, or not, transitions to the replay buffer based on the state

Hi everyone,
I’m trying to apply RL to a real-life setup. When starting a new episode, I want to apply the policy right away, but not store the first few transitions, to avoid any transient effects. The condition would be on position, for example, and that information is in the state. How can I modify my replay buffer’s add() method, or my environment’s step() method, to ignore some transitions?

Thanks in advance

Hi,

You can modify the add() method of your custom replay buffer to conditionally store transitions based on the state information. Here’s an example of how you can create a custom replay buffer by extending the base ReplayBuffer class:

from ray.rllib.utils.replay_buffers.replay_buffer import ReplayBuffer

class CustomReplayBuffer(ReplayBuffer):
    def add(self, batch, **kwargs):
        # Only store the batch if it passes the state-based condition.
        if self.should_store_transition(batch):
            super().add(batch, **kwargs)

    def should_store_transition(self, batch):
        # Implement your condition here based on the state information.
        # For example, if the position is part of the observation, you can
        # read it out of batch["obs"] and check it against your criteria.
        return True  # or False, based on your condition

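If the position is part of the observation vector, the predicate itself can be a small standalone function. Here is a minimal sketch, assuming (hypothetically) that the position is the first feature of the observation and that a transition counts as settled once the position lies inside a fixed band; the function name, the index, and the band limits are all assumptions for illustration:

```python
def should_store_transition(obs, low=-0.5, high=0.5):
    """Return True if this transition should go into the buffer.

    Assumes (hypothetically) that obs[0] is the position mentioned in
    the question; low/high define the band where transients have
    died out.
    """
    position = obs[0]  # assumption: position is the first state feature
    return low <= position <= high
```

Inside the buffer's add(), you would extract the observation from the incoming batch (e.g. from batch["obs"]) and pass it to a predicate like this.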
Then, you can use this custom replay buffer in your RLlib configuration:

from ray.rllib.algorithms.dqn import DQNConfig

config = (
    DQNConfig()
    .environment("CartPole-v1")
    .framework("torch")
    .rollouts(num_rollout_workers=4)
    .training(
        replay_buffer_config={"type": CustomReplayBuffer},
    )
)

This way, your custom replay buffer only stores the transitions that meet the condition you define on the state information.
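If the transient is better characterized by time than by position, another option (not RLlib-specific) is to simply drop the first n transitions after every episode reset. A hedged sketch of such a counter, with hypothetical names, that you could call from your environment's step()/reset() or from the buffer's add():

```python
class TransientSkipper:
    """Drop the first n_skip transitions after each episode reset (sketch)."""

    def __init__(self, n_skip):
        self.n_skip = n_skip
        self.steps = 0

    def reset(self):
        # Call this whenever a new episode starts.
        self.steps = 0

    def should_store(self):
        # Count each transition; allow storage only after the first n_skip.
        self.steps += 1
        return self.steps > self.n_skip
```

You could hold one of these in your custom buffer and combine it with the state-based check, e.g. store a transition only when both conditions pass.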