How severely does this issue affect your experience of using Ray?
- Medium: It contributes significant difficulty to completing my task, but I can work around it.
How can I set the number of initial collect steps for rollout when using RLlib? I am training with `batch_mode="complete_episodes"` and have set `trainer_config["min_train_timesteps_per_reporting"] = 0` and `trainer_config["min_sample_timesteps_per_reporting"] = 0`, so each call to `.train()` rolls out one episode and then trains. However, the first time `.train()` is called, around 40 episodes get rolled out at once and put into the replay buffer.
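For reference, a minimal sketch of the trainer config described above (assuming an off-policy algorithm with a replay buffer; the exact algorithm and any other keys are not specified in this question):

```python
# Sketch of the config described above. "batch_mode" and the two
# reporting keys are RLlib trainer-config keys; the rest of the
# config (algorithm choice, env, etc.) is omitted here.
trainer_config = {
    "batch_mode": "complete_episodes",
    # With both reporting minimums at 0, each .train() call is
    # expected to sample exactly one episode before training.
    "min_train_timesteps_per_reporting": 0,
    "min_sample_timesteps_per_reporting": 0,
}
```

The surprising behavior is that the very first `.train()` call collects many episodes before the first training step, despite these settings.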
How is this number determined, and is it possible to set this initial collection step count to some other number?