Is there a way to set num_env_steps_sampled?


I’m trying to train a PPO agent in a custom environment whose step() takes about 3 minutes per call.

from ray.rllib.algorithms.ppo import PPOConfig

config = PPOConfig()
config = config.environment(env=CustomEnv)
algo = config.framework('torch').build()
result = algo.train()

I notice that the result returned by train() contains num_env_steps_sampled:

...
num_agent_steps_sampled: 4000
num_agent_steps_trained: 4000
num_env_steps_sampled: 4000
num_env_steps_trained: 4000
num_env_steps_sampled_this_iter: 4000
num_env_steps_trained_this_iter: 4000
timesteps_total: 4000
num_steps_trained_this_iter: 4000
...

Is there a way to set num_env_steps_sampled? (For PPO and other built-in algorithms.)
What is the best practice for training in an environment with an expensive step()?

Hi @radillus ,

The minimum number of steps sampled is mainly determined by the rollout_fragment_length, the number of envs per rollout worker, and the number of rollout workers. RLlib does not check on every step whether a maximum has been reached; the rollout workers simply collect their fragments and, once finished, send them back to the main Algorithm instance.
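
As a minimal sketch of how those knobs are typically set on a PPOConfig (this assumes the RLlib 2.x config API, where num_rollout_workers, num_envs_per_worker and rollout_fragment_length are the corresponding parameters of .rollouts(); check the docs for your exact version):

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(env=CustomEnv)        # CustomEnv from your snippet above
    .framework("torch")
    .rollouts(
        num_rollout_workers=2,         # parallel rollout workers
        num_envs_per_worker=1,         # vectorized sub-envs per worker
        rollout_fragment_length=100,   # env steps a worker collects before sending a fragment back
    )
)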

The number of samples collected in each iteration is also bound by the train batch size:
RLlib will collect at least as many samples as train_batch_size dictates.
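
Concretely, for an environment where step() takes minutes, lowering train_batch_size (together with the SGD minibatch size) reduces how many env steps each train() iteration requires. Continuing the sketch above (same assumed 2.x API; the numbers are only placeholders chosen so that 2 workers × 1 env × 100 fragment steps equals the batch size):

config = config.training(
    train_batch_size=200,     # at least this many env steps are sampled per iteration
    sgd_minibatch_size=50,    # must not exceed train_batch_size
)
algo = config.build()
result = algo.train()
print(result["num_env_steps_sampled_this_iter"])  # should now be roughly train_batch_size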

Play around with these numbers and see what happens if anything is still unclear :slight_smile: