Population Based Training and optimizer checkpointing

Hi all,

I am trying to run PBT to determine the optimal learning rate for my model; my question is about checkpointing the optimizer. I am currently using AdamW (in PyTorch), and as mentioned here it would be nice to save the previous optimizer state. However, restoring a checkpoint seems to overwrite the learning rate from the config, so the learning rate would never be able to mutate. Does Ray Tune get around this somehow? If it does, I am curious how. Or do you need to ensure the learning rate is updated from the config after restoring a checkpoint? Thanks in advance.

Brandon

Yeah, you just need to ensure that the learning rate is updated from the config after restoring from a checkpoint. `optimizer.load_state_dict()` restores the saved learning rate in `param_groups` along with AdamW's moment estimates, so you have to overwrite it with the config value afterwards; otherwise the restored value silently wins and PBT's mutations never take effect.
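
Here is a minimal sketch of that pattern, assuming a function-based trainable with the older `checkpoint_dir`-style restore hook; `train_fn`, `checkpoint.pt`, and the toy model are placeholders for your own code:

```python
import os

import torch
import torch.nn as nn


def train_fn(config, checkpoint_dir=None):
    # Toy model standing in for your real network.
    model = nn.Linear(10, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])

    if checkpoint_dir is not None:
        state = torch.load(os.path.join(checkpoint_dir, "checkpoint.pt"))
        model.load_state_dict(state["model"])
        # Restores AdamW's moment estimates -- but it also restores the
        # learning rate that was saved inside param_groups.
        optimizer.load_state_dict(state["optimizer"])
        # Re-apply the (possibly mutated) learning rate from the config
        # so PBT's perturbations actually take effect.
        for param_group in optimizer.param_groups:
            param_group["lr"] = config["lr"]

    # ... training loop: step, report metrics, save checkpoints ...
```

The loop over `optimizer.param_groups` at the end is the key part: the model weights and AdamW's running moments survive the restore, while the learning rate follows whatever value PBT has put in the config.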