One other thing that might be relevant: the evaluation worker has a separate environment from the training workers. So if you are using a value stored in the environment to track the number of episodes, then seeing 1 episode would make sense, because that counter is only counting the evaluation episodes.
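
To make that concrete, here is a minimal plain-Python sketch of the effect. It does not use any real RL library: `CountingEnv`, the number of workers, and the episode counts are all hypothetical and only stand in for however your own environment keeps its counter.

```python
class CountingEnv:
    """Hypothetical toy environment that counts the episodes it has started."""

    def __init__(self):
        self.episode_count = 0

    def reset(self):
        # Each reset marks the start of a new episode for THIS instance only.
        self.episode_count += 1
        return 0  # dummy observation


# Assumption: every worker builds its own environment instance,
# so the counters are not shared between workers.
training_envs = [CountingEnv() for _ in range(2)]
evaluation_env = CountingEnv()

# The training workers run many episodes...
for env in training_envs:
    for _ in range(50):
        env.reset()

# ...while the evaluation worker runs a single evaluation episode.
evaluation_env.reset()

training_total = sum(env.episode_count for env in training_envs)
evaluation_total = evaluation_env.episode_count
print(training_total, evaluation_total)
```

If the value you are reading comes from the evaluation environment, it will only ever reflect `evaluation_total`, no matter how many episodes the training workers have run.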