RLlib: parameter equivalent to nb_steps_warmup?

Hi there, I'm wondering whether there is a parameter that delays training of the RL agent until after a certain number of steps when using trainer.train() or tune.run(), like the nb_steps_warmup parameter in the keras-rl package. See:
a use case
internal details
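For reference, the keras-rl usage I mean is roughly this (just a sketch based on the standard keras-rl examples; the env and the tiny model are placeholders, and the exact API may differ between versions):

```python
import gym
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import EpsGreedyQPolicy
from rl.memory import SequentialMemory

env = gym.make("CartPole-v1")
nb_actions = env.action_space.n

# Tiny placeholder Q-network (window_length=1, hence the leading 1 in the shape)
model = Sequential([
    Flatten(input_shape=(1,) + env.observation_space.shape),
    Dense(16, activation="relu"),
    Dense(nb_actions, activation="linear"),
])

dqn = DQNAgent(
    model=model,
    nb_actions=nb_actions,
    memory=SequentialMemory(limit=50000, window_length=1),
    policy=EpsGreedyQPolicy(),
    nb_steps_warmup=1000,      # collect 1000 steps before the first gradient update
    target_model_update=1e-2,
)
dqn.compile(Adam(lr=1e-3), metrics=["mae"])
dqn.fit(env, nb_steps=10000, verbose=1)
```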
Thanks in advance.


Some algos, namely those that use a replay buffer, have a learning_starts parameter. For on-policy algos such a setting wouldn't really make sense, since samples from timesteps before learning_starts would simply be discarded without having any effect on anything.
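For example, with DQN it could look something like this (a sketch against the older dict-based config API; where learning_starts lives, and its default, can vary between Ray versions, so treat the values as illustrative):

```python
import ray
from ray import tune

ray.init()

tune.run(
    "DQN",  # off-policy, replay-buffer algo
    stop={"timesteps_total": 100_000},
    config={
        "env": "CartPole-v0",
        "learning_starts": 1000,   # sample 1000 env steps before the first update
        "train_batch_size": 32,
    },
)
```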


Ohh, thanks, that makes sense! Originally I intended to let the agent explore more and see more of the environment, because I recently discovered that PPO on my custom env (large action and state space) gets stuck in a local optimum and only rarely escapes it to reach a better result. Maybe I should first try increasing train_batch_size over the default configuration (I'm just starting to learn RL and quite confused, lol)?
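Something like this is what I'm thinking of trying first (just a sketch with illustrative values; parameter names follow the older dict-based PPO config, and my custom env would replace CartPole):

```python
import ray
from ray import tune

ray.init()

tune.run(
    "PPO",
    stop={"timesteps_total": 1_000_000},
    config={
        "env": "CartPole-v0",        # placeholder for my custom env
        "train_batch_size": 16000,   # larger batches -> less noisy policy updates
        "sgd_minibatch_size": 1024,
        "num_sgd_iter": 10,
        "entropy_coeff": 0.01,       # entropy bonus to encourage more exploration
        "lr": 1e-4,
    },
)
```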