I realized that `learners/__all_modules__/num_env_steps_trained` has a strangely large value, multiple times higher than `learners/__all_modules__/num_module_steps_trained`.
For example:
My sample and train batch size are both 2048; I train for 20 epochs with a minibatch size of 128. The resulting `learners` log is:
```
__all_modules__: {
  num_module_steps_trained: 40960,
  num_env_steps_trained: 655360
}
```
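For reference, this is roughly my setup (a minimal sketch; the parameter names `train_batch_size`, `num_epochs`, and `minibatch_size` are my assumption for a recent Ray release with the new API stack, older versions used `num_sgd_iter` and `sgd_minibatch_size` instead):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Hypothetical reconstruction of the config described above.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .training(
        train_batch_size=2048,  # full batch collected per training iteration
        num_epochs=20,          # epochs over that batch per iteration
        minibatch_size=128,     # SGD minibatch size -> 2048 / 128 = 16 minibatches per epoch
    )
)
algo = config.build()
result = algo.train()
# Result layout assumed from the log shown above.
print(result["learners"]["__all_modules__"])
```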
`num_module_steps_trained` makes sense to me; it is:

num_module_steps_trained = 2048 samples * 20 epochs = 128 minibatch_size * (2048 / 128 minibatches) * 20 epochs = 40960
However, `num_env_steps_trained` makes no sense to me: it is 16 times higher. It appears to be calculated as:

num_env_steps_trained = 2048 samples * 20 epochs * (2048 / 128 minibatches) = 2048 * 320 updates total = 655360
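The arithmetic below just makes the two counters and the 16x factor explicit:

```python
train_batch_size = 2048
num_epochs = 20
minibatch_size = 128

minibatches_per_epoch = train_batch_size // minibatch_size      # 16
total_minibatch_updates = minibatches_per_epoch * num_epochs    # 320

# Expected: each minibatch update trains 128 module steps.
num_module_steps_trained = minibatch_size * total_minibatch_updates
print(num_module_steps_trained)  # 40960 == 2048 * 20

# Observed: each minibatch update seems to count the FULL batch size
# (2048) as env steps, not the minibatch size.
num_env_steps_trained = train_batch_size * total_minibatch_updates
print(num_env_steps_trained)     # 655360 == 16 * 40960
```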
I assume this is a bug in the logging: `batch.env_steps()`, despite `batch` being a minibatch of size 128, returns the full batch size of 2048. So 2048 is logged for each of the 320 minibatch updates and summed up to 655360.
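A minimal sketch of what I suspect is happening (this is my hypothesis, not RLlib's actual code; the `Batch` class and `minibatches` helper are hypothetical stand-ins):

```python
# Hypothetical illustration of the suspected accounting bug, NOT RLlib source.
class Batch:
    def __init__(self, data, env_step_count):
        self.data = data
        self._env_step_count = env_step_count

    def env_steps(self):
        # Reports the env step count this batch was constructed with.
        return self._env_step_count

    def minibatches(self, size):
        for i in range(0, len(self.data), size):
            # Bug hypothesis: each minibatch slice inherits the PARENT
            # batch's env step count instead of its own length.
            yield Batch(self.data[i:i + size], self._env_step_count)


full_batch = Batch(list(range(2048)), env_step_count=2048)

num_env_steps_trained = 0
for epoch in range(20):
    for mb in full_batch.minibatches(128):
        # Each of the 320 updates logs 2048 instead of 128.
        num_env_steps_trained += mb.env_steps()

print(num_env_steps_trained)  # 655360
```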
Shouldn't this be logged differently (and perhaps elsewhere), or is there another way to interpret this number?