I’d like to be able to save a policy (not a checkpoint) that can be loaded into a trainer later on.
I have one policy that is trained for, say, 10,000 training iterations, and I’d like to save it every 100 iterations so I can run experiments on the snapshots later (as in the self-play paper: http://arxiv.org/abs/2006.04471). Loading a full checkpoint each time is too slow, and the checkpoint files are pretty large and come with all the other policies in the multiagent config.
I saw the policy.export_model method, which creates a TorchScript file, but I’m not sure that’s what I want.
I guess a better way might be to pickle the output of policy.get_state(), then initialize a single trainer once and call policy.set_state() with the loaded state for each experiment.
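
To make that idea concrete, here is a rough sketch of what I have in mind. The helpers save_policy_state and load_policy_state are names I made up, and DummyPolicy is only a hypothetical stand-in so the snippet runs without Ray. With RLlib I would expect to pass the object returned by trainer.get_policy(...) instead, assuming its get_state() output is picklable.

```python
import os
import pickle
import tempfile


def save_policy_state(policy, path):
    """Pickle whatever policy.get_state() returns (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(policy.get_state(), f)


def load_policy_state(policy, path):
    """Unpickle a saved state and push it into an existing policy object."""
    with open(path, "rb") as f:
        policy.set_state(pickle.load(f))


class DummyPolicy:
    """Hypothetical stand-in exposing only get_state()/set_state().

    This is NOT an RLlib class; it exists so the sketch is runnable.
    """

    def __init__(self, weights):
        self.weights = dict(weights)

    def get_state(self):
        return {"weights": dict(self.weights)}

    def set_state(self, state):
        self.weights = dict(state["weights"])


# Pretend training loop: snapshot the policy state every 100 iterations.
snapshot_dir = tempfile.mkdtemp()
train_policy = DummyPolicy({"w": 0.0})
for iteration in range(1, 301):
    train_policy.weights["w"] += 0.01  # placeholder for a real training step
    if iteration % 100 == 0:
        path = os.path.join(snapshot_dir, f"policy_{iteration:05d}.pkl")
        save_policy_state(train_policy, path)

# Later: reuse one already-built policy and swap in a saved snapshot.
eval_policy = DummyPolicy({"w": -1.0})
load_policy_state(eval_policy, os.path.join(snapshot_dir, "policy_00200.pkl"))
```

Each snapshot file then holds only the state of the one policy, and the evaluation side is built once and has its state swapped per experiment.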