    # Size of batches collected from each worker.
    "rollout_fragment_length": 200,
    # Number of timesteps collected for each SGD round. This defines the size
    # of each SGD epoch.
    "train_batch_size": 4000,
    # Total SGD batch size across all devices for SGD. This defines the
    # minibatch size within each epoch.
    "sgd_minibatch_size": 128,
    # Whether to shuffle sequences in the batch when training (recommended).
    "shuffle_sequences": True,
    # Number of SGD iterations in each outer loop (i.e., number of epochs to
    # execute per train batch).
    "num_sgd_iter": 30,
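To make the relationship between these values concrete, here is a small arithmetic sketch using the numbers from the config above. The helper `batch_arithmetic` and its result keys are hypothetical names made up for illustration, not part of RLlib, and it assumes that leftover samples which do not fill a whole minibatch are skipped; the library may handle that remainder differently.

```python
# Values copied from the config above.
config = {
    "rollout_fragment_length": 200,
    "train_batch_size": 4000,
    "sgd_minibatch_size": 128,
    "num_sgd_iter": 30,
}


def batch_arithmetic(cfg):
    """Hypothetical helper: derive counts implied by the config values."""
    # How many rollout fragments are needed to fill one train batch.
    fragments = cfg["train_batch_size"] // cfg["rollout_fragment_length"]
    # How many full minibatches fit into one pass (epoch) over the train batch.
    full_minibatches = cfg["train_batch_size"] // cfg["sgd_minibatch_size"]
    # Samples left over after the last full minibatch (assumed skipped here).
    leftover = cfg["train_batch_size"] % cfg["sgd_minibatch_size"]
    # Gradient steps per train batch if every epoch uses only full minibatches.
    sgd_steps = cfg["num_sgd_iter"] * full_minibatches
    return {
        "fragments_per_train_batch": fragments,
        "full_minibatches_per_epoch": full_minibatches,
        "leftover_samples": leftover,
        "sgd_steps_per_train_batch": sgd_steps,
    }


result = batch_arithmetic(config)
print(result)
```

Under these assumptions, one train batch corresponds to 20 fragments of 200 timesteps, and each of the 30 epochs makes 31 full minibatch updates.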
Hi, I have a question about the PPO neural network update.
I reckon that “train_batch_size” / “rollout_fragment_length” is the number of fragments, and that “rollout_fragment_length” is the λ size of TD(λ). Is that right?
The neural network is updated once per “train_batch_size” timesteps. For example, as in the figure below, if the train batch size is 1000, the update period is 1000 timesteps. Is that right?
Finally, which data are used to update the neural network? As in the figure, are minibatches of size “sgd_minibatch_size” drawn from the train batch and used for the update? And what exactly does “num_sgd_iter” mean?
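For reference, the update scheme that the config comments seem to describe can be sketched as a generic minibatch SGD loop. This is only an assumption about the general pattern, not RLlib's actual implementation: the function `iterate_minibatches` is a hypothetical name, it shuffles individual timestep indices rather than sequences, and it skips any leftover samples that do not fill a whole minibatch.

```python
import random


def iterate_minibatches(train_batch_size, sgd_minibatch_size, num_sgd_iter,
                        shuffle=True, seed=0):
    """Yield (epoch, sample_indices) pairs for a generic minibatch SGD loop."""
    rng = random.Random(seed)
    indices = list(range(train_batch_size))
    # Each SGD iteration (epoch) is one pass over the same train batch.
    for epoch in range(num_sgd_iter):
        if shuffle:
            rng.shuffle(indices)
        # Step through the batch in minibatch-sized slices, full slices only.
        last_start = train_batch_size - sgd_minibatch_size
        for start in range(0, last_start + 1, sgd_minibatch_size):
            yield epoch, indices[start:start + sgd_minibatch_size]


# Toy numbers matching the example in the question: a train batch of 1000
# timesteps, minibatches of 100, and 3 epochs over that batch.
updates = list(iterate_minibatches(1000, 100, 3))
print(len(updates))
```

In this sketch every sample in the train batch is used once per epoch, so a gradient update happens for each minibatch rather than once per train batch.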