Does the agent train per episode or per iteration

carlorop · November 1, 2021, 11:56am

The parameter train_batch_size controls the size of the training batch. If train_batch_size=n, does that mean the agent starts learning after n episodes or after n iterations (calls to the step() method of the environment)?

arturn · November 1, 2021, 7:57pm

Hi @carlorop

The timing of the first iteration of your optimization algorithm depends not only on the train_batch_size, but also on how often experiences are collected from rollout workers or how large the collected chunks are (see rollout_fragment_length).
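As a rough illustration of how these two settings interact, the sketch below counts how many sampling rounds would be needed before a full train batch is available. It is a simplified model and not RLlib's actual code: it assumes every rollout worker returns exactly one fragment of rollout_fragment_length steps per round. The function name `sampling_rounds_needed` and its `num_workers` argument are made up for this example.

```python
import math


def sampling_rounds_needed(train_batch_size, rollout_fragment_length, num_workers=1):
    """Rounds of collection needed to gather at least train_batch_size steps.

    Assumption (not taken from RLlib internals): each worker contributes
    exactly rollout_fragment_length environment steps per round.
    """
    steps_per_round = rollout_fragment_length * num_workers
    return math.ceil(train_batch_size / steps_per_round)


# 2 workers * 200 steps = 400 steps per round, so 4000 steps take 10 rounds.
print(sampling_rounds_needed(4000, 200, num_workers=2))  # 10
```

Under this model, smaller fragments or fewer workers mean more rounds of collection before the first update can happen.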

Your Trainer instance starts training as soon as it has at least train_batch_size experiences at hand, where each experience usually corresponds to one time step in your environment.

So the n in train_batch_size=n denotes steps in the environment and not episodes, although your learner thread might start to learn somewhat later.
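To make the steps-versus-episodes distinction concrete, here is a small toy simulation. It does not use RLlib at all: it assumes a hypothetical environment whose episodes always last a fixed number of steps, and simply counts how many episodes have finished by the time n steps have been collected.

```python
def episodes_seen_before_training(train_batch_size, episode_length):
    """Count steps and completed episodes until train_batch_size steps exist.

    Assumption: every episode lasts exactly episode_length steps
    (a toy stand-in for a real environment).
    """
    steps = 0
    episodes = 0
    while steps < train_batch_size:
        steps += 1  # one call to env.step()
        if steps % episode_length == 0:
            episodes += 1  # an episode just finished
    return steps, episodes


print(episodes_seen_before_training(1000, 10))   # (1000, 100)
print(episodes_seen_before_training(1000, 400))  # (1000, 2)
```

In both cases the batch is full after the same 1000 steps, while the number of completed episodes differs widely depending on episode length.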

Cheers
