How to set initial collect steps?

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

How can I set the initial collect steps for rollouts when using RLlib? I am training in complete-episodes mode with trainer_config["min_train_timesteps_per_reporting"] = 0 and trainer_config["min_sample_timesteps_per_reporting"] = 0, so each call to .train() rolls out one episode and then trains. However, the first time .train() is called, around 40 episodes are rolled out at once and put into the replay buffer.

How is this number determined, and is it possible to set the initial collect steps to some other number?

I think you’re looking for train_batch_size, rollout_fragment_length, and batch_mode here.

Hi @pd-perry ,

What algorithm are you using, and on which version of Ray? On master, in the Q-learning algorithms, num_steps_sampled_before_learning_starts determines how many steps are sampled before RLlib returns. In older versions of RLlib, the replay_starts parameter of the replay_buffer_config had that job: depending on the algorithm's implementation, it would first fill the replay buffer before returning.

Please try setting num_steps_sampled_before_learning_starts and using our latest nightly build!
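For reference, here is a rough sketch of what that might look like as a plain config dict, plus a back-of-the-envelope estimate of how many complete episodes a given warm-up threshold translates to. The config keys are the ones named in this thread; the values, the helper episodes_to_fill_warmup, and the example numbers (1000 warm-up steps, 25 steps per episode) are assumptions for illustration only, not values confirmed for any particular algorithm or Ray version.

```python
import math

# Keys are the ones mentioned in this thread; values are illustrative.
trainer_config = {
    "batch_mode": "complete_episodes",
    "min_train_timesteps_per_reporting": 0,
    "min_sample_timesteps_per_reporting": 0,
    # Assumption: a threshold of 0 asks for no extra warm-up sampling
    # before learning starts.
    "num_steps_sampled_before_learning_starts": 0,
}


def episodes_to_fill_warmup(warmup_steps: int, mean_episode_len: float) -> int:
    """Hypothetical helper: estimate how many complete episodes are
    needed before `warmup_steps` environment steps have been sampled."""
    if warmup_steps <= 0:
        return 0
    return math.ceil(warmup_steps / mean_episode_len)


# Example with assumed numbers: a 1000-step threshold and episodes of
# about 25 steps each.
print(episodes_to_fill_warmup(1000, 25))
```

If the threshold in your setup works out to roughly 40 times your mean episode length, that would be consistent with the ~40 episodes you see on the first .train() call.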