I know that one can update the learning rate over training using `lr_schedule`, and I also noticed `entropy_coeff_schedule`. These seem to have special handling within the policy classes.
Does a more general approach exist for modifying other training parameters over time, e.g. `batch_size`? Or perhaps even updating some of the environment config? I know this makes training non-stationary, but I can imagine cases where it would be useful.
Is the only approach to stop training and restart it with a new config?
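For concreteness, the kind of mechanism I have in mind would behave like the existing schedules (piecewise-linear interpolation over timesteps), but apply to arbitrary config keys. A purely hypothetical sketch of such a lookup, not an existing API:

```python
from bisect import bisect_right

def piecewise_value(schedule, t):
    """Linearly interpolate a [[timestep, value], ...] schedule at step t.

    Hypothetical helper that mirrors how lr_schedule-style
    schedules behave; not part of any library.
    """
    steps = [s for s, _ in schedule]
    if t <= steps[0]:
        return schedule[0][1]
    if t >= steps[-1]:
        return schedule[-1][1]
    i = bisect_right(steps, t) - 1
    (t0, v0), (t1, v1) = schedule[i], schedule[i + 1]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# The generalization I'm imagining: schedules keyed by arbitrary
# config entries ("train_batch_size" here is just an example key).
schedules = {"train_batch_size": [[0, 1000], [1_000_000, 8000]]}
```

Something like this could then be consulted once per training iteration to rewrite the relevant config entries, which is what I'm hoping already exists in some form.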