PER Buffer throws KeyError during training of SAC

araman5 · April 7, 2025, 2:20pm

1. Severity of the issue:
Medium: Significantly affects my productivity but can find a workaround.

2. Environment:

Ray version: 2.43.0
Python version: 3.12
OS: Ubuntu 22.04
Cloud/Infrastructure: N/A
Other libs/tools (if relevant): N/A

3. What happened vs. what you expected:

Expected: I am training an SAC agent on a custom environment (config shown in the snippet below). I call algo.train() in a for loop for 5000 training epochs and expected training to run to completion.

from ray.rllib.algorithms.sac import SACConfig

# CONFIG and env_config are defined elsewhere in my training script.
config = (
    SACConfig()
    # .environment("LunarLanderContinuous-v3")
    .environment(
        env="LiveStreaming",
        env_config=env_config,
        disable_env_checking=True,
    )
    # .env_runners(num_env_runners=5)
    .resources(num_gpus=1)
    .learners(num_learners=1, num_gpus_per_learner=1)
    .framework("torch")
    .training(
        actor_lr=CONFIG['sac']['actor_lr'],
        critic_lr=CONFIG['sac']['critic_lr'],
        gamma=CONFIG['sac']['gamma'],
        train_batch_size=CONFIG['sac']['batch_size'],
        tau=CONFIG['sac']['tau'],
        initial_alpha=0.2,
        target_entropy='auto',
        twin_q=True,
        num_steps_sampled_before_learning_starts=180 * 500,
    )
    .rl_module(
        model_config={
            "fcnet_hiddens": [512, 256, 128, 64],
            "fcnet_activation": "tanh",
        }
    )
    .evaluation(
        evaluation_num_workers=1,
        evaluation_interval=CONFIG['training']['evaluation_interval'],
    )
)

Actual: After a variable number of epochs, training exits with an error that points into prioritized_episode_buffer.py. The error is pasted below. Different runs of the training code produce the same KeyError of 2097151, but it appears after a different number of training epochs each time.

  File "/home/cc/workspace/project/train_rllib.py", line 237, in <module>
    main(CONFIG['general']['total_episodes'])
  File "/home/cc/workspace/project/train_rllib.py", line 223, in main
    train_result = algo.train()
                   ^^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/project/lib/python3.12/site-packages/ray/tune/trainable/trainable.py", line 330, in train
    raise skipped from exception_cause(skipped)
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/tune/trainable/trainable.py", line 327, in train
    result = self.step()
             ^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/rllib/algorithms/algorithm.py", line 964, in step
    train_results, train_iter_ctx = self._run_one_training_iteration()
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/rllib/algorithms/algorithm.py", line 2991, in _run_one_training_iteration
    training_step_return_value = self.training_step()
                                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/rllib/algorithms/dqn/dqn.py", line 630, in training_step
    return self._training_step_new_api_stack()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/rllib/algorithms/dqn/dqn.py", line 674, in _training_step_new_api_stack
    episodes = self.local_replay_buffer.sample(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cc/.pyenv/versions/livegabr/lib/python3.12/site-packages/ray/rllib/utils/replay_buffers/prioritized_episode_buffer.py", line 477, in sample
    index_triple = self._indices[self._tree_idx_to_sample_idx[idx]]
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
KeyError: 2097151
Traceback (most recent call last):
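
One observation on the key itself: 2097151 equals 2**21 - 1, which would be the last leaf index of a sum tree with 2**21 leaves. The sketch below is only a hypothesis about how such an index could come out of a prioritized sampler, not a confirmed diagnosis of RLlib's code. `SumTree` and `tree_idx_to_sample_idx` are illustrative stand-ins written for this post, not RLlib's classes. It shows that if the sampled priority mass is greater than or equal to the tree's total, the prefix-sum search walks right at every level and lands on the last leaf, which has no entry in the index map.

```python
class SumTree:
    """Minimal array-backed sum tree (illustrative, not RLlib's implementation)."""

    def __init__(self, capacity: int):
        assert capacity > 0 and capacity & (capacity - 1) == 0, "power of two required"
        self.capacity = capacity
        # Node 1 is the root; leaves occupy [capacity, 2 * capacity).
        self.nodes = [0.0] * (2 * capacity)

    def __setitem__(self, idx: int, priority: float) -> None:
        i = idx + self.capacity
        self.nodes[i] = priority
        i //= 2
        while i >= 1:
            # Each internal node stores the sum of its two children.
            self.nodes[i] = self.nodes[2 * i] + self.nodes[2 * i + 1]
            i //= 2

    def total(self) -> float:
        return self.nodes[1]

    def find_prefixsum_idx(self, prefixsum: float) -> int:
        """Return the leaf whose cumulative-priority interval contains prefixsum."""
        i = 1
        while i < self.capacity:
            left = 2 * i
            if self.nodes[left] > prefixsum:
                i = left
            else:
                prefixsum -= self.nodes[left]
                i = left + 1
        return i - self.capacity


tree = SumTree(8)
tree_idx_to_sample_idx = {}  # hypothetical map, filled only for occupied leaves
for leaf, priority in enumerate([1.0, 2.0, 3.0]):
    tree[leaf] = priority
    tree_idx_to_sample_idx[leaf] = leaf

# A mass strictly inside [0, total) resolves to an occupied leaf.
print(tree.find_prefixsum_idx(2.5))  # 1

# A mass equal to the total falls through to the last leaf (capacity - 1).
bad_leaf = tree.find_prefixsum_idx(tree.total())
print(bad_leaf)  # 7
try:
    tree_idx_to_sample_idx[bad_leaf]
except KeyError as err:
    print("KeyError:", err)
```

If something similar happens in the real buffer, the question for the maintainers would be how the sampled mass can reach the total priority sum, for example through floating-point drift or a non-finite priority. That part is speculation until someone inspects the buffer state at the time of the failure.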

verdictlg · May 14, 2025, 7:20pm

Hi,

Facing the same issue, greatly appreciate any help.

Multiple topics report a KeyError at the same index, 2097151.

Another topic with the same error: "Error occuring repeatedly during training at random time steps"

josephlee · May 14, 2025, 11:24pm

Did this work for you? I had the same error.

josephlee · May 15, 2025, 8:39am

Before that error occurred, my RAM usage suddenly spiked, my computer froze, and then I got that error.

Lars_Simon_Zehnder · May 15, 2025, 9:37am

@araman5 Thanks for raising this issue. This looks like a bug in the source code. Could you please create an issue in github/ray-project/ray and link it here? Then I can take a look at it.

araman5 · May 15, 2025, 7:40pm

Hi,

Yes, I will first test it with the latest version of Ray (if there has been an update since I posted this) before raising an issue in github/ray-project/ray.

verdictlg · May 22, 2025, 6:05am

@araman5 I could reproduce the issue with the stable release. Are you still facing it?
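
Since the failure shows up after a different number of iterations on every run, it may help to record the iteration count and the full traceback for the GitHub issue. A minimal sketch, assuming `algo` is any object with a `train()` method (in this thread, the built SAC algorithm); `run_until_failure` is a hypothetical helper name, not part of Ray.

```python
import traceback


def run_until_failure(algo, max_iters):
    """Call algo.train() up to max_iters times.

    Returns (iterations_completed, traceback_text), where traceback_text
    is None if no exception was raised.
    """
    for i in range(max_iters):
        try:
            algo.train()
        except Exception:
            # Capture the full traceback text so it can be attached to a report.
            return i, traceback.format_exc()
    return max_iters, None
```

Running this a few times with a fixed seed and noting the returned iteration counts would show whether the failure point is reproducible or truly random.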
