How to resume training from a checkpoint

Finebouche · December 11, 2023, 1:29pm

How severe does this issue affect your experience of using Ray?

High: It blocks me to complete my task.

Hi everyone.

Using ray.tune, I am able to train my model and save its state as training progresses. However, I still haven’t found a way to reuse my trained policy in another setting (that is, to reuse the neural network in a different experiment).

What is the official way to do this?

starkj · December 17, 2023, 4:24pm

@Finebouche I had the same question a few months ago. I found it odd that Ray Tune doesn’t provide a straightforward way to start a new Tune job from an existing Tune checkpoint. You have to go through some gyrations. I eventually learned how to use a callback to restore just the policy weights from a checkpoint and proceed from there. Of course, that starts over with no optimizer params, training loop counters, etc., but the raw NN itself can be carried forward. For details, see Tune as part of curriculum training - #14 by gjoliver. The discussion is a bit circuitous for a while, but about halfway down you will see the talk of callbacks.
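A minimal sketch of the callback idea described above, not the exact code from the linked thread. It assumes the old RLlib API stack (Ray 2.x), a single policy with the ID "default_policy", and a made-up checkpoint path. The names POLICY_CHECKPOINT, copy_weights and RestoreWeightsCallback are hypothetical.

```python
try:
    from ray.rllib.algorithms.callbacks import DefaultCallbacks
    from ray.rllib.policy.policy import Policy
except ImportError:  # keeps the sketch importable where Ray is not installed
    DefaultCallbacks = object
    Policy = None

# Hypothetical path to one policy inside a saved algorithm checkpoint.
POLICY_CHECKPOINT = "/path/to/checkpoint/policies/default_policy"


def copy_weights(algorithm, source_policy, policy_id="default_policy"):
    """Overwrite one policy's weights in `algorithm` with those of `source_policy`."""
    algorithm.set_weights({policy_id: source_policy.get_weights()})


class RestoreWeightsCallback(DefaultCallbacks):
    """Loads pretrained network weights once, right after the algorithm is built."""

    def on_algorithm_init(self, *, algorithm, **kwargs):
        # Only the network weights are restored; optimizer state and
        # training counters start from scratch, as noted in the post above.
        pretrained = Policy.from_checkpoint(POLICY_CHECKPOINT)
        copy_weights(algorithm, pretrained)
```

The callback class would then be registered on the algorithm config (for example with config.callbacks(RestoreWeightsCallback)) before handing that config to Tune. Whether this matches your Ray version is worth checking against its documentation.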

Lars_Simon_Zehnder · December 19, 2023, 3:08pm

@starkj Thanks for posting this here!

@Finebouche Have you tried this approach yet? Using Policy.from_checkpoint() is the official way here. Also take a look at the example here.
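For reference, a hedged sketch of what using Policy.from_checkpoint() outside of a callback might look like. It assumes the old RLlib API stack, where the call returns a single Policy for a policy checkpoint and a dict of policy ID to Policy for an algorithm checkpoint. The helper names load_policy and warm_start, and the loader parameter (only there so the helpers can be exercised without Ray), are hypothetical.

```python
def load_policy(checkpoint_path, policy_id="default_policy", loader=None):
    """Return a single restored Policy from a policy or algorithm checkpoint."""
    if loader is None:
        # Imported lazily so the helper can be defined without Ray installed.
        from ray.rllib.policy.policy import Policy
        loader = Policy.from_checkpoint
    restored = loader(checkpoint_path)
    # Algorithm checkpoint: {policy_id: Policy}. Policy checkpoint: a Policy.
    if isinstance(restored, dict):
        return restored[policy_id]
    return restored


def warm_start(algorithm, checkpoint_path, policy_id="default_policy", loader=None):
    """Copy pretrained network weights into an already-built algorithm."""
    source = load_policy(checkpoint_path, policy_id, loader)
    # Weights only: optimizer state and counters are not transferred.
    algorithm.get_policy(policy_id).set_weights(source.get_weights())
    return algorithm
```

With a freshly built algorithm for the new experiment, warm_start(algo, "/path/to/checkpoint") would overwrite its network weights before training starts. The path here is a placeholder.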

Alex-Golod · December 21, 2023, 8:30am

@starkj, thank you very much for posting here!

Can you give me some advice on how to restore the optimizer parameters to continue training from the checkpoint? And did you find out where the module_state.pt and default_policy_default_optimizer.pt files are read?

Thank you in advance!

Finebouche · December 21, 2023, 5:34pm

Hi @Lars_Simon_Zehnder, I did try the example you linked but it’s not working.

I will take a look at @starkj’s solution tomorrow. Thanks for the help, everyone.

starkj · December 21, 2023, 8:10pm

@Alex-Golod I did not do anything with the optimizer data, as I just wanted the raw network params (e.g. to use them as pre-training).

Alex-Golod · December 22, 2023, 8:20am

@starkj Thank you for your reply. In Ray 2.9.0 I have found some new functionality that may help with restoring optimizer parameters: see the "Learner (Alpha)" page in the Ray 2.9.0 documentation.
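Separately from the Learner API mentioned above, another option that may be worth checking for continuing training with more than just the raw weights is to rebuild the whole algorithm from its checkpoint with Algorithm.from_checkpoint(). This is only a sketch: whether optimizer state is restored this way depends on the Ray version and API stack, so verify it for your setup. The helper name resume_training and its restore parameter (a hook for trying it without Ray) are hypothetical.

```python
def resume_training(checkpoint_path, num_iterations=1, restore=None):
    """Rebuild an algorithm from a checkpoint and run a few more iterations."""
    if restore is None:
        # Imported lazily so the helper can be defined without Ray installed.
        from ray.rllib.algorithms.algorithm import Algorithm
        restore = Algorithm.from_checkpoint
    algorithm = restore(checkpoint_path)
    # Each train() call runs one training iteration and returns a result dict.
    results = [algorithm.train() for _ in range(num_iterations)]
    return algorithm, results
```

Unlike the weights-only approaches earlier in the thread, this keeps the original config, so it suits continuing the same experiment rather than moving a network into a different one.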
