What is the purpose of AddOneTsToEpisodesAndTruncate?

MCW_Lad · August 13, 2025, 1:28am

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.

I’ve been working on a project that involves going into episode trajectories and rearranging some of their contents to better serve a specialized shared critic. I noticed that each episode had an extra timestep at the end, and traced it to the AddOneTsToEpisodesAndTruncate connector attached in the PPOLearner class.

Now, that file says the connector is necessary for value-function (VF) bootstrapping. But I looked at compute_value_targets, which handles that calculation, and the episode-end case already seems to be covered by the mask applied through continues: on the step where an episode terminates, the ‘next value’ is ignored entirely, and an extra zero is appended to the value list so that this also works for the very last step.
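To make the masking argument concrete, here is a minimal NumPy sketch of lambda-return value targets with a continues mask. It is not RLlib's compute_value_targets; the function name lambda_returns, its signature, and the bootstrap_value parameter are hypothetical and only there to illustrate the point. The check is whether the targets change when the value placed after the last step changes.

```python
import numpy as np


def lambda_returns(rewards, values, terminateds, bootstrap_value=0.0,
                   gamma=0.99, lambda_=0.95):
    """Hypothetical helper (not RLlib code): lambda-return targets for one episode.

    bootstrap_value stands in for V(s_T), the value of the state reached
    after the last recorded step.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    terminateds = np.asarray(terminateds, dtype=np.float64)

    # V(s_{t+1}) for every step t; the last entry is the appended value.
    next_values = np.append(values[1:], bootstrap_value)
    # 0.0 on terminal steps, 1.0 everywhere else.
    continues = 1.0 - terminateds

    targets = np.zeros_like(rewards)
    last = bootstrap_value  # lambda-return of the step after t
    for t in reversed(range(len(rewards))):
        # G_t = r_t + gamma * c_t * ((1 - lambda) * V(s_{t+1}) + lambda * G_{t+1})
        last = rewards[t] + gamma * continues[t] * (
            (1.0 - lambda_) * next_values[t] + lambda_ * last
        )
        targets[t] = last
    return targets


rewards = [1.0, 1.0, 1.0]
values = [0.5, 0.4, 0.3]

# Case 1: the episode terminates on its last step.
terminated = [0, 0, 1]
term_zero = lambda_returns(rewards, values, terminated, bootstrap_value=0.0)
term_big = lambda_returns(rewards, values, terminated, bootstrap_value=123.0)
print(np.allclose(term_zero, term_big))  # the appended value is masked out

# Case 2: the last step is not terminal (e.g. the episode was cut off).
not_terminated = [0, 0, 0]
cut_zero = lambda_returns(rewards, values, not_terminated, bootstrap_value=0.0)
cut_five = lambda_returns(rewards, values, not_terminated, bootstrap_value=5.0)
print(np.allclose(cut_zero, cut_five))  # here the appended value changes the targets
```

In this sketch the appended value has no effect when the final step is terminal, which matches the observation above. It does affect the targets when the final step is not terminal, so the non-terminated case may be where a real value for the extra timestep matters; whether that is what the connector is for is a guess and not something the file confirms.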

Am I missing something here? Was this connector used to make an earlier implementation of PPO work?

I should note that my implementation removes it, and trajectory calculations seem to work just fine.
