Error when setting done=true: eval_data[i].env_id yields IndexError: list index out of range

Hey @nathanlct, yeah, there was a similar bug in RLlib that was fixed here:

I think this should fix your problem as well. Yes, it happens when a new(!) agent enters the episode at the same time step at which the episode terminates: this agent has an initial obs, but no action ever has to be computed for it.
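To make the failure mode concrete, here is a toy sketch in plain Python. It is not RLlib's actual sampler code and not the actual fix. `EvalItem`, `build_eval_data`, `env_ids_unguarded` and `env_ids_guarded` are hypothetical names made up for illustration. The only assumption is the one described above: an agent that appears on the terminating step has an obs but is never queued for action computation.

```python
from typing import Dict, List, NamedTuple


class EvalItem(NamedTuple):
    """Hypothetical stand-in for one entry queued for action computation."""
    env_id: int
    agent_id: str
    obs: float


def build_eval_data(obs: Dict[str, float], done: bool, env_id: int = 0) -> List[EvalItem]:
    # Once the episode is done, no further actions are needed,
    # so nothing gets queued, even for agents that have an obs.
    if done:
        return []
    return [EvalItem(env_id, agent_id, o) for agent_id, o in obs.items()]


def env_ids_unguarded(agent_ids: List[str], eval_data: List[EvalItem]) -> List[int]:
    # Assumes one eval_data entry per agent that has an obs.
    # This assumption breaks on the terminating step.
    return [eval_data[i].env_id for i in range(len(agent_ids))]


def env_ids_guarded(agent_ids: List[str], eval_data: List[EvalItem]) -> List[int]:
    # Only look up agents that were actually queued for an action.
    queued = {item.agent_id: item.env_id for item in eval_data}
    return [queued[a] for a in agent_ids if a in queued]


if __name__ == "__main__":
    # "new_agent" enters on the very step the episode terminates.
    obs = {"agent_0": 0.1, "new_agent": 0.5}
    eval_data = build_eval_data(obs, done=True)
    try:
        env_ids_unguarded(list(obs), eval_data)
    except IndexError as err:
        print("unguarded lookup failed:", err)
    print("guarded lookup:", env_ids_guarded(list(obs), eval_data))
```

The unguarded lookup indexes `eval_data` once per agent with an obs, which is exactly the kind of access that raises `IndexError` when the list is shorter than expected. The guarded version just skips agents that were never queued.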