Tune as part of curriculum training

I meant to ask whether everything runs well without the checkpoint, but you answered that already!
To make progress on this, you should cast it into a minimal reproduction script.
I don’t think it has anything to do with your environment, and it’s highly unlikely to be related to most of the configuration you do.
Since saving and loading checkpoints with your script works on my end: could you try doing this inside a single script with a toy environment and no further configuration, something like the sketch below, and see if that runs?
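A minimal sketch of what I mean, assuming a recent Ray/RLlib with PPO (the exact save/restore return types differ slightly between versions, so adjust to yours):

```python
import tempfile

from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.algorithms.ppo import PPOConfig

# Toy environment, no further configuration, one training iteration.
algo = PPOConfig().environment("CartPole-v1").build()
algo.train()

# Write a checkpoint into a known directory. `algo.save()` works as well,
# but its return type differs between Ray versions, so the lower-level
# `save_checkpoint()` is used here for clarity.
checkpoint_dir = tempfile.mkdtemp()
algo.save_checkpoint(checkpoint_dir)
algo.stop()

# Restore from that directory. If this already hangs here, the problem
# reproduces without any of your custom configuration.
restored = Algorithm.from_checkpoint(checkpoint_dir)
restored.train()
restored.stop()
```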

Also, sorry about the delay in getting back to you.
One thing I noticed: in this callback, can you avoid reloading the whole Algorithm?
The algorithm checkpoint directory has a policies/ sub-dir that contains all the policies you have trained with that run of PPO. There is likely a default_policy/ folder in there as well, since that’s the policy ID used when you don’t do multi-agent.
So in the callback, can you do something like Policy.from_checkpoint("<your algorithm checkpoint dir>/policies/default_policy/")?
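Roughly like this (an untested sketch, assuming the old-stack callbacks API and the single-agent default policy ID; the checkpoint path is a placeholder you would fill in):

```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.policy.policy import Policy


class RestoreWeightsCallback(DefaultCallbacks):
    def on_algorithm_init(self, *, algorithm, **kwargs):
        # Load only the single policy from the algorithm checkpoint's
        # policies/ sub-directory instead of restoring the whole Algorithm.
        restored_policy = Policy.from_checkpoint(
            "<your algorithm checkpoint dir>/policies/default_policy"
        )
        # Copy its weights into the freshly built algorithm.
        algorithm.set_weights({"default_policy": restored_policy.get_weights()})
```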

I’d also like to share an example for something like this: https://github.com/ray-project/ray/blob/master/rllib/examples/connectors/self_play_with_policy_checkpoint.py#L54-L68

@gjoliver @arturn thanks for the new suggestions. It appears I was able to get the checkpoint restored successfully by restoring the Policy checkpoint instead of the Algorithm checkpoint. Even with a very simplified script, I still found the Algorithm restoration to hang. But I am now able to move on. Thank you very much.

If anyone experiences similar issues, please add to this thread. Thank you.

Hi everyone! I’ve posted two issues related to this topic: RLlib| Algorithm.from_checkpoint doesn't work correctly · Issue #41290 · ray-project/ray · GitHub and [RLlib] Issue with /rllib/rllib-saving-and-loading-algos-and-policies · Issue #40347 · ray-project/ray · GitHub. With the DreamerV3 algorithm I can’t restore the model weights with Algorithm.from_checkpoint(checkpoint_path): after two different restores from the same checkpoint I get different weights (see the sketch below). With the PPO algorithm the weights are restored correctly by Algorithm.from_checkpoint(checkpoint_path), but we think the optimizer parameters are not restored correctly, and we lose all the metrics within 2-5 training iterations.
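This is roughly the check behind that observation (a sketch only; the checkpoint path is a placeholder, and get_weights() may behave differently on the new RLModule stack that DreamerV3 uses):

```python
import numpy as np
from ray.rllib.algorithms.algorithm import Algorithm

checkpoint_path = "<your checkpoint dir>"

# Restore twice from the exact same checkpoint ...
algo_a = Algorithm.from_checkpoint(checkpoint_path)
algo_b = Algorithm.from_checkpoint(checkpoint_path)

# ... and compare the weights of the default policy/module. For a correctly
# restored checkpoint, every array should match.
weights_a = algo_a.get_weights()["default_policy"]
weights_b = algo_b.get_weights()["default_policy"]
for name in weights_a:
    if not np.allclose(weights_a[name], weights_b[name]):
        print(f"Mismatch in {name}")
```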

If you have any ideas on how to restore all parts of the algorithm from a checkpoint, I’ll be very happy to hear them.

Thank you in advance.

Maybe this post is helpful