I want to know what the ‘training iteration’ parameter stands for. Is it related to train episodes or train timesteps?
It is the number of model updates as far as I know.
What’s the connection between it and train episodes?
- Each “training iteration” corresponds to one call to Trainer.train().
- A timestep is a single action taken in the environment.
- “Train episodes” is the number of episodes completed while collecting the batch used for training. Note that with batch_mode="truncate_episodes", the train batch may also contain incomplete episodes (ones that don’t have done=True at the end).
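
To make the relationship between the three counters concrete, here is a minimal toy sketch in plain Python. It does not use RLlib: ToyTrainer, its fields, and the result keys are hypothetical names chosen for illustration, and the fixed episode length and batch size are assumptions. It only mimics the idea that each train() call collects a fixed number of timesteps and that episodes may be cut off at the batch boundary, as with truncated episodes.

```python
from dataclasses import dataclass


@dataclass
class ToyTrainer:
    """Hypothetical stand-in for a trainer; NOT RLlib's actual class."""
    train_batch_size: int = 10   # assumed: timesteps collected per train() call
    episode_len: int = 4         # assumed: every toy episode lasts 4 timesteps
    iterations: int = 0
    timesteps_total: int = 0
    episodes_total: int = 0
    _steps_in_episode: int = 0   # progress of the episode currently running

    def train(self) -> dict:
        """One training iteration: collect one batch, then (pretend to) update."""
        episodes_this_iter = 0
        for _ in range(self.train_batch_size):
            # One timestep = one action taken in the environment.
            self._steps_in_episode += 1
            if self._steps_in_episode == self.episode_len:  # episode is done
                episodes_this_iter += 1
                self._steps_in_episode = 0
        self.iterations += 1
        self.timesteps_total += self.train_batch_size
        self.episodes_total += episodes_this_iter
        # Illustrative keys only; a real result dict may use different names.
        return {
            "training_iteration": self.iterations,
            "timesteps_this_iter": self.train_batch_size,
            "episodes_this_iter": episodes_this_iter,
            "unfinished_steps_carried_over": self._steps_in_episode,
        }


trainer = ToyTrainer()
results = [trainer.train() for _ in range(3)]
for r in results:
    print(r)
```

With these toy numbers, the three iterations complete 2, 3, and 2 episodes respectively. So the episode count per iteration varies even though the timestep count per iteration is fixed, and an episode left unfinished at the end of one batch continues in the next.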