What does the 'training_iteration' parameter relate to in RLlib?

I want to know what the ‘training iteration’ parameter stands for. Is it related to training episodes or training timesteps?

It is the number of model updates as far as I know.

What’s the connection between it and train episodes?

  • Each “training iteration” corresponds to one call to Trainer.train().
  • A timestep is a single action taken in the environment.
  • “Train episodes” is the number of episodes completed while collecting the batch used for training. Note that with batch_mode="truncate_episodes", that train batch may also contain incomplete episodes (ones that don’t have done=True at the end).
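
To make the relationship between the three counters concrete, here is a toy simulation in plain Python. It does not use RLlib and is not how RLlib is implemented: ToyTrainer, batch_size, episode_len and the unfinished_episode_steps key are all hypothetical names made up for this sketch, and it assumes every episode lasts exactly 4 timesteps and every train() call collects exactly 10 timesteps, cutting episodes at the batch boundary the way truncate_episodes is described above.

```python
from dataclasses import dataclass


@dataclass
class ToyTrainer:
    """Hypothetical stand-in for a trainer; NOT RLlib's implementation."""

    batch_size: int = 10   # assumed: timesteps collected per train() call
    episode_len: int = 4   # assumed: every toy episode ends after 4 timesteps
    training_iteration: int = 0
    timesteps_total: int = 0
    episodes_total: int = 0
    steps_in_episode: int = 0  # progress of the episode currently running

    def train(self) -> dict:
        """One training iteration: collect one batch, then count what it held."""
        episodes_this_iter = 0
        for _ in range(self.batch_size):
            # One timestep = one action taken in the environment.
            self.timesteps_total += 1
            self.steps_in_episode += 1
            if self.steps_in_episode == self.episode_len:
                # The episode reached its end (done=True in the real thing).
                episodes_this_iter += 1
                self.steps_in_episode = 0
        # The batch is cut here even if an episode is still in progress.
        self.training_iteration += 1
        self.episodes_total += episodes_this_iter
        return {
            "training_iteration": self.training_iteration,
            "timesteps_total": self.timesteps_total,
            "episodes_this_iter": episodes_this_iter,
            # Hypothetical key: steps of an episode cut off by the batch end.
            "unfinished_episode_steps": self.steps_in_episode,
        }


trainer = ToyTrainer()
results = [trainer.train() for _ in range(3)]
for r in results:
    print(r)
```

In this toy setup the iteration counter goes up by exactly one per train() call and the timestep counter by the batch size, while the number of completed episodes per iteration varies (2, then 3, then 2), because an episode cut off at the end of one batch finishes in the next one.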