Hi all,
I am trying to run PBT to determine the optimal learning rate for my model, and my question is about checkpointing the optimizer. I am currently using AdamW (in PyTorch), and as mentioned here it would be nice to save the previous optimizer state. However, restoring the optimizer's state dict from a checkpoint seems to overwrite the learning rate that came from the config, so the learning rate would never actually be able to mutate. Does Ray Tune get around this somehow? If it does, I am curious how. Or do you need to make sure the learning rate is re-applied from the config after loading the checkpoint? Thanks in advance.
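For context, here is roughly what I have in mind (a minimal sketch, not my actual trainable; `model`, `config`, and `checkpoint_path` are placeholders): load the optimizer state dict so the AdamW moment estimates carry over, then overwrite the learning rate in each param group with whatever PBT put into the config.

```python
import torch
from torch.optim import AdamW


def restore_optimizer(model, config, checkpoint_path):
    # Placeholders: `config` is the (possibly mutated) PBT config,
    # `checkpoint_path` points at a checkpoint saved by a previous trial.
    optimizer = AdamW(model.parameters(), lr=config["lr"])

    checkpoint = torch.load(checkpoint_path)
    # Restoring the state dict brings back the Adam moment estimates,
    # but it also restores the old learning rate into the param groups...
    optimizer.load_state_dict(checkpoint["optimizer_state_dict"])

    # ...so re-apply the learning rate from the config afterwards.
    # Is this step necessary, or does Ray Tune handle it for me?
    for param_group in optimizer.param_groups:
        param_group["lr"] = config["lr"]

    return optimizer
```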
Brandon