"episodes_this_iter" in New API Stack

In the new API stack, the equivalent of "episodes_this_iter" is env_runners/num_episodes, which reports the number of episodes completed in the current training iteration, summed across all EnvRunners. The metric env_runners/num_episodes_lifetime, by contrast, tracks the cumulative number of episodes completed over the entire training run. Both the Ray RLlib documentation and the source code confirm this: the per-iteration episode count is logged and reported under the NUM_EPISODES key.
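As a minimal sketch, assuming the result of a training iteration is a nested dict in which the slash-separated paths above map to nested keys (so env_runners/num_episodes is result["env_runners"]["num_episodes"]), both counters could be read like this. The helper name episode_counts and the sample result dict are hypothetical and only for illustration; in practice the dict would come from your algorithm's train() call.

```python
from typing import Any, Mapping, Tuple


def episode_counts(result: Mapping[str, Any]) -> Tuple[int, int]:
    """Return (episodes this iteration, episodes over the whole run).

    Hypothetical helper: assumes the new-API-stack layout in which both
    counters live under the "env_runners" sub-dict of the result.
    Missing keys fall back to 0 instead of raising.
    """
    env_runners = result.get("env_runners", {})
    this_iter = int(env_runners.get("num_episodes", 0))
    lifetime = int(env_runners.get("num_episodes_lifetime", 0))
    return this_iter, lifetime


# Made-up result dict standing in for the output of one training iteration.
sample_result = {
    "env_runners": {
        "num_episodes": 4,            # replaces the old "episodes_this_iter"
        "num_episodes_lifetime": 10,  # cumulative since training started
    }
}

this_iter, lifetime = episode_counts(sample_result)
print(f"episodes this iteration: {this_iter}, lifetime: {lifetime}")
```

The fallback to 0 is a design choice for this sketch, so that an iteration whose result lacks these keys does not raise a KeyError.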

Would you like more detail or a code example?
