I have tried calling `agent.restore("path to checkpoint")` and then continuing training with `tune.run(self.agent, ...hyperparameters here)`. A message is logged to the console telling me that the agent has been restored, but the call creates a new directory (i.e. a new experiment with its own metadata) and then runs in a perpetual loop, periodically printing info statements about the task (essentially saying that it's still pending, forever).
This happens even when I ask for as little as 25 more episodes (which should take at most 4 minutes to train): the same pending info statement is printed over and over without any actual training taking place. Furthermore, the new directory created by this second `tune.run` call is completely empty.
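For reference, here is a simplified sketch of what I am doing. The algorithm (`PPOTrainer`), the environment, and the checkpoint path are placeholders standing in for my actual setup:

```python
def restore_and_continue(checkpoint_path: str) -> None:
    # Imports kept local so this sketch can be read standalone;
    # this is the pre-2.x Ray API (tune.run / rllib.agents).
    import ray
    from ray import tune
    from ray.rllib.agents.ppo import PPOTrainer  # placeholder algorithm

    ray.init()

    # Rebuild the agent and restore the finished checkpoint.
    # This step logs a "restored" message to the console as expected.
    agent = PPOTrainer(config={"env": "CartPole-v0"})
    agent.restore(checkpoint_path)

    # This is the step that misbehaves: instead of continuing in the
    # existing experiment directory, it creates a new (empty) one and
    # the trial stays pending forever.
    tune.run(
        agent,
        stop={"training_iteration": 25},
        config={"env": "CartPole-v0"},
    )
```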
Can someone point me in the right direction on how to do both of the following:
- Properly restore a "fully" trained agent (i.e. one that completed its training loop previously).
- Continue training this "fully" trained agent for some additional iterations, updating only the directory and metadata of the original experiment rather than creating a completely new directory?