[RLlib] Ray RLlib config parameters for PPO

sven1977 · February 8, 2021, 8:01am

Hey @Xim_Lee,
Check out this documentation page, where we explain these config keys in more detail:
https://docs.ray.io/en/master/rllib-sample-collection.html

On the PPO-specific keys:
sgd_minibatch_size: PPO takes a train batch (of size train_batch_size) and splits it into minibatches of sgd_minibatch_size samples each. E.g. if train_batch_size=1000 and sgd_minibatch_size=100, we get 10 “sub-sampling” pieces (minibatches) out of the train batch.
num_sgd_iter: The number of passes made over the train batch. In each pass, every one of the above sub-sampling pieces is fed to the NN once for an update. So in the above example, with num_sgd_iter=30, we do 30 x 10 = 300 updates altogether on one single train batch.
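
To make the arithmetic concrete, here is a minimal sketch in plain Python (it does not call RLlib). The helper name `ppo_sgd_update_count` is hypothetical, and the sketch assumes train_batch_size divides evenly by sgd_minibatch_size; how a leftover partial minibatch is handled may differ between RLlib versions.

```python
from typing import Tuple


def ppo_sgd_update_count(
    train_batch_size: int, sgd_minibatch_size: int, num_sgd_iter: int
) -> Tuple[int, int]:
    """Return (minibatches per pass, total updates per train batch).

    Hypothetical helper for illustration only; not part of RLlib.
    Assumes the train batch splits evenly into minibatches.
    """
    if train_batch_size % sgd_minibatch_size != 0:
        raise ValueError("this sketch assumes an even split into minibatches")
    minibatches_per_pass = train_batch_size // sgd_minibatch_size
    # Every minibatch is used once per pass, and there are num_sgd_iter passes.
    total_updates = minibatches_per_pass * num_sgd_iter
    return minibatches_per_pass, total_updates


# The values from the example in this post.
config = {
    "train_batch_size": 1000,
    "sgd_minibatch_size": 100,
    "num_sgd_iter": 30,
}

minibatches, updates = ppo_sgd_update_count(**config)
print(minibatches, updates)  # 10 300
```

The same three keys are what you would set in your PPO config; only the counting logic here is illustrative.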
