Error when setting done=True: eval_data[i].env_id raises IndexError: list index out of range

Hello,

I have upgraded ray from 0.8.0 to 2.0.0.dev and am now getting this error while training in my multi-agent environment:

  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 1468, in _process_policy_eval_results
    env_id: int = eval_data[i].env_id
IndexError: list index out of range
Full stack trace:
2021-02-12 23:14:51,675 ERROR trial_runner.py:708 -- Trial PPO_0_train_and_sgd_batch_sizes=1000: Error processing event.
Traceback (most recent call last):
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 678, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 597, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/worker.py", line 1458, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(IndexError): ray::PPO.train_buffered() (pid=81810, ip=172.20.10.3)
  File "python/ray/_raylet.pyx", line 486, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/tune/trainable.py", line 167, in train_buffered
    result = self.train()
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 535, in train
    raise e
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 524, in train
    result = Trainable.train(self)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/tune/trainable.py", line 226, in train
    result = self.step()
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 148, in step
    res = next(self.train_exec_impl)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 876, in apply_flatten
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
    item = next(it)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
ray.exceptions.RayTaskError(IndexError): ray::RolloutWorker.par_iter_next() (pid=81809, ip=172.20.10.3)
  File "python/ray/_raylet.pyx", line 486, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 432, in ray._raylet.execute_task.function_executor
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/util/iter.py", line 1152, in par_iter_next
    return next(self.local_it)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 327, in gen_rollouts
    yield self.sample()
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 678, in sample
    batches = [self.input_reader.next()]
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 98, in next
    batches = [self.get_data()]
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 232, in get_data
    item = next(self.rollout_provider)
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 694, in _env_runner
    sample_collector=sample_collector,
  File "/Users/nathan/opt/anaconda3/envs/cc/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 1468, in _process_policy_eval_results
    env_id: int = eval_data[i].env_id
IndexError: list index out of range

Changing the value of no_done_at_end doesn't seem to make a difference. The error happens when I set dones[rl_id] = True for an agent. It fails around here, in _process_policy_eval_results:

    actions: List[EnvActionType] = unbatch(actions)
    # type: int, EnvActionType
    for i, action in enumerate(actions):
        # Clip if necessary.
        if clip_actions:
            clipped_action = clip_action(action,
                                         policy.action_space_struct)
        else:
            clipped_action = action

        env_id: int = eval_data[i].env_id
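        # ^ This line raises the IndexError when i >= len(eval_data).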

It seems that the code has actions even for agents that are done, i.e. len(actions) = len(eval_data) + n_dones, where n_dones is the number of dones[rl_id] that I set to True in that iteration, which leads to the index error.
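To illustrate the mismatch, here is a minimal hypothetical sketch (plain Python, not RLlib code): with one extra action returned for a done agent, the loop indexes past the end of eval_data.

    from collections import namedtuple

    # Hypothetical stand-in for RLlib's eval_data entries; only env_id matters here.
    EvalData = namedtuple("EvalData", ["env_id"])

    eval_data = [EvalData(env_id=0), EvalData(env_id=0)]  # n = 2 pending agents
    actions = [1, 0, 1]  # n + n_dones = 3 actions came back from the policy

    for i, action in enumerate(actions):
        env_id = eval_data[i].env_id  # IndexError once i reaches 2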

I have already spent quite some time trying to debug this, so I figured I would ask here in case it is something trivial that changed when upgrading Ray.

Thanks!

Edit: I ended up setting config['no_done_at_end'] = True and removing all the dones[rl_id] = True assignments from my code. The error still happens, but much less frequently and not systematically: my environment sometimes gets through several episodes/resets before it occurs. It still always happens within the first 10 minutes of training, though, and always right after a reset.
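For reference, a minimal sketch of that workaround (the env name is a hypothetical placeholder and the rest of my config is omitted):

    import ray
    from ray import tune

    ray.init()

    config = {
        "env": "my_multi_agent_env",  # hypothetical registered env name
        "no_done_at_end": True,  # don't add done=True at the end of episodes
    }

    tune.run("PPO", config=config)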

Alright, I finally figured it out.

If a new agent enters the environment at the very last step of the episode, it doesn't have a last observation, so the sample collector adds an initial observation for it (line 1099 of rllib/evaluation/sampler.py). The _add_to_next_inference_call function is thus called right before the reset and appends one element to self.forward_pass_agent_keys[pid] (rllib/evaluation/collectors/simple_list_collector.py, line ~775). Then the reset is called and an initial observation is given to all the new agents, which results in self.forward_pass_agent_keys[pid] having length n+1 when there are only n agents in the environment, and that propagates into the error.

I'm not sure whether I did something wrong with the agents' dones or whether it's a bug, though. As a temporary fix, I'm not adding the agent id to the states/rewards/dones/info data if we're at the last step of the episode and it's a new agent.
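That temporary fix looks roughly like this in my env's step() (a sketch only; the helpers and bookkeeping attributes below are illustrative, not my actual code):

    from ray.rllib.env.multi_agent_env import MultiAgentEnv

    class MyEnv(MultiAgentEnv):  # hypothetical env; only step() is sketched
        def step(self, action_dict):
            obs, rewards, dones, infos = {}, {}, {}, {}
            episode_done = self._is_last_step()  # hypothetical helper

            for agent_id in self._active_agents:  # hypothetical bookkeeping
                # Temporary fix: if the episode terminates on this very step,
                # skip agents that just entered the env so they never receive
                # an initial observation right before the reset.
                if episode_done and agent_id in self._new_agents:
                    continue
                obs[agent_id] = self._get_obs(agent_id)
                rewards[agent_id] = self._get_reward(agent_id)
                dones[agent_id] = episode_done
                infos[agent_id] = {}

            dones["__all__"] = episode_done
            return obs, rewards, dones, infos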


Hey @nathanlct, yeah, there was a similar bug in RLlib that was fixed here:

I think this should fix your problem as well. Yes, it happened when a new(!) agent entered the episode and the episode terminated at that same time step, so the agent had an initial obs but no action had to be calculated for it.