How to restore a trained agent to further train it?

I have tried to use agent.restore("path to checkpoint") and then continue training with tune.run(self.agent, ...hyperparameters here). I do get a message logged to the console telling me that my agent has been restored, but tune.run then creates a new directory (i.e. a new experiment with its own metadata) and runs in a perpetual loop, periodically printing info statements about the task (essentially saying that it is still pending, forever).

This happens even when I ask for something like 25 more episodes (which should take at most 4 minutes to train); it just keeps printing the info statement above without doing any actual training. Furthermore, the new directory created by running tune.run again with the restored agent is completely empty.
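To make the question concrete, here is a minimal sketch of the call pattern described above. The algorithm (PPO), environment, paths, and stop condition are placeholders; my real code wraps the trainer in a class and passes `self.agent`:

```python
import ray
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer  # placeholder; my actual algorithm differs

ray.init()
config = {"env": "CartPole-v0"}  # placeholder config

# Step 1: restore the previously trained agent.
# This logs the "Restored on ..." message shown further down.
agent = PPOTrainer(config=config)
agent.restore("path/to/checkpoint_000002/checkpoint-2")

# Step 2: try to continue training through tune.run.
# This is the part that creates a brand-new, empty experiment directory
# and then sits in PENDING forever.
tune.run(
    PPOTrainer,                   # in my code this comes from self.agent
    config=config,
    stop={"episodes_total": 25},  # a few extra episodes, ~4 minutes at most
)
```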

Can someone point me in the right direction on how to:

  1. Properly restore a "fully" trained agent (i.e. one that previously completed its training loop), and
  2. Continue training this "fully" trained agent for some more training iterations, updating only the directory and metadata of that original experiment rather than creating a completely new directory?

Thanks so much. For reference, I have looked at this and this, but neither of them seems to train the agent further correctly; both result in the situation above.

This is the message that shows up when I call my load() function which restores the agent:

```
2021-07-28 12:36:52,082	INFO trainable.py:378 -- Restored on 10.0.0.37 from checkpoint:
....checkpoint_000002\checkpoint-2
2021-07-28 12:36:52,086	INFO trainable.py:385 -- Current state after restoring: {'_iteration': 2, '_timesteps_total': None, '_time_total': 817.6041111946106, '_episodes_total': 55}
```

Hi @cl_tch ,

maybe this helps you with your problem.

Simon


Will re-importing the weights into the agent and then running it with tune.run(agent, …) work? I am currently using tune.run() in my project with an episodes_total stop condition, so I'd rather keep using Tune than switch to the Python API.

There is a `restore` option for `tune.run`, which allows you to provide a checkpoint. `tune.run` will restore the created Trainer from this checkpoint and then "continue" training.

E.g. `ray.rllib.examples.unity3d_env_local.py` has a `--from_checkpoint` option.
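For concreteness, a minimal sketch of that pattern; the algorithm, env, stop condition, and checkpoint path are placeholders:

```python
from ray import tune

# tune.run builds the Trainer itself, restores its state from the given
# checkpoint, and then keeps training until the stop condition is met.
tune.run(
    "PPO",                                             # or your Trainer class
    config={"env": "CartPole-v0"},
    stop={"episodes_total": 100},
    restore="path/to/checkpoint_000002/checkpoint-2",  # previously saved checkpoint
)
```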


@sven1977

Thanks Sven, that works. Is there a way to set up the call to tune.run, when using the restore="path to checkpoint" parameter, so that the experiment actually resumes in the checkpoint's directory rather than creating a new directory with essentially the same data and storing the new checkpoint there?

Essentially what I mean is: if I specify the restore path, can the further-trained agent just overwrite the data within that restore path, rather than creating a new directory and storing the further-trained agent's data and checkpoint there?
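To illustrate, the sketch below is roughly what I am hoping for. `local_dir` and `name` are real tune.run arguments that control where results are written, but I do not know whether pointing them at the original experiment makes Tune reuse that directory in place, so treat this as part of the question rather than an answer:

```python
from ray import tune

tune.run(
    "PPO",
    config={"env": "CartPole-v0"},
    stop={"episodes_total": 125},
    restore="path/to/my_experiment/checkpoint_000002/checkpoint-2",
    # Desired behavior: write the new results/checkpoints back into the
    # original experiment folder instead of a fresh directory.
    local_dir="path/to",    # parent results directory
    name="my_experiment",   # experiment name (directory under local_dir)
)
```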