Why auto-adjust `rollout_fragment_length` by a floor division instead of a ceiling operation?

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

The PPOTrainer class auto-adjusts rollout_fragment_length by a floor division if
train_batch_size % (num_workers * num_envs_per_worker * rollout_fragment_length) != 0.
Is there a reason why rollout_fragment_length isn’t auto-adjusted by a ceiling operation instead?

An example:
train_batch_size = 4000
rollout_fragment_length = 200
num_workers = 7
num_envs_per_worker = 1

RLlib auto-adjusts rollout_fragment_length to 571 (the result of the floor division 4000 // 7). One sampling round then yields only 7 * 571 = 3997 samples, just short of train_batch_size, so a second round is needed and RLlib ends up collecting a train batch of size 7994.
A ceiling operation, i.e. math.ceil(train_batch_size / (num_workers * num_envs_per_worker)), would instead yield a new rollout_fragment_length of 572, and the collected train batch would have a size of only 7 * 572 = 4004.

Sample collection per train step would be roughly halved (4004 instead of 7994 samples).
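
To make the arithmetic easy to check, here is a minimal sketch in plain Python. It does not call RLlib. The helper `collected_batch_size` is hypothetical and only models the assumption that all workers sample whole fragments in rounds until at least train_batch_size samples are available, which is consistent with the sizes quoted above.

```python
import math


def collected_batch_size(train_batch_size, num_workers, num_envs_per_worker,
                         rollout_fragment_length):
    """Hypothetical helper (not an RLlib function).

    Assumes every sampling round returns one full fragment per environment
    and that rounds repeat until train_batch_size is reached or exceeded.
    """
    per_round = num_workers * num_envs_per_worker * rollout_fragment_length
    rounds = math.ceil(train_batch_size / per_round)
    return rounds * per_round


train_batch_size = 4000
num_workers = 7
num_envs_per_worker = 1
parallel_envs = num_workers * num_envs_per_worker

# Floor division: the adjustment described in the post.
floor_length = train_batch_size // parallel_envs
# Ceiling: the adjustment proposed in the post.
ceil_length = math.ceil(train_batch_size / parallel_envs)

floor_batch = collected_batch_size(
    train_batch_size, num_workers, num_envs_per_worker, floor_length)
ceil_batch = collected_batch_size(
    train_batch_size, num_workers, num_envs_per_worker, ceil_length)

print(f"floor: fragment length {floor_length}, collected batch {floor_batch}")
print(f"ceil:  fragment length {ceil_length}, collected batch {ceil_batch}")
```

With the values from the example, the floor variant falls three samples short after one round and therefore needs a second one, while the ceiling variant overshoots by four samples in a single round.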


Hi klausk55! Thanks for raising this.
That’s indeed a logical flaw. I’ll write a PR.

The PR: [RLlib] PPO automatic train batch size calculation fix by ArturNiederfahrenhorst · Pull Request #25621 · ray-project/ray · GitHub
