tune.Tuner.restore bug?

Jorgen_Svane · November 30, 2022, 11:12am

Hi

Ray 2.1.0

It seems to me that restoring a failed run that uses a scheduler does not work when done with something like this:

tuner = tune.Tuner.restore("…")
results = tuner.fit()

When num_samples is larger than what the available resources can run at once, the previously paused trials don’t restart when it should be their turn, nor does the scheduler (here PB2) seem to be reactivated.

Output of initial fit():

Above you can see the PBT algo (here PB2) running with checkpoints and perturbs.
However, restoring a failed run (stopped by pressing Ctrl+C) seems to fail, with output looking like this:

Only the trial that was running before is restarted. The trials that were paused when the “fail” occurred are not, even though many more time steps were performed beyond the point where they were paused.

Sample code for running this can be found here

BR

Jorgen

justinvyu · December 2, 2022, 6:22pm

Hey @Jorgen_Svane,

Thanks for the detailed summary! This brings up two issues that need to be fixed:

1. Schedulers are not loaded back correctly on restoration when using Tuner.restore(). The restored experiment defaults to the FIFOScheduler, as you can see from the status log. The FIFO scheduler doesn’t handle paused trials, which is why you only see the running trial making progress. This will happen with any scheduler, not just PBT.
2. Most schedulers, such as PBT/PB2, don’t have save/restore functionality implemented. I will be looking into this and will keep you updated.

I’ve opened an issue on GitHub: [Tune] `Tuner.restore` doesn't restore schedulers properly (Issue #30838, ray-project/ray).
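To make the second issue above more concrete, here is a minimal sketch of what save/restore for scheduler state could look like in principle. It is plain Python and does not use Ray. The class `ToyPerturbationScheduler`, its methods, and the file name `scheduler_state.json` are all hypothetical names chosen for illustration, and the state it tracks (last perturbation step per trial, list of paused trials) is an assumption about what a PBT-style scheduler would need to persist, not a description of Ray's implementation.

```python
import json
import os
import tempfile


class ToyPerturbationScheduler:
    """Hypothetical PBT-style scheduler; NOT Ray's API, illustration only."""

    def __init__(self, perturbation_interval=4):
        self.perturbation_interval = perturbation_interval
        # trial_id -> time step at which the trial was last perturbed
        self.last_perturbation = {}
        # trial_ids that are paused and waiting for their turn
        self.paused = []

    def on_result(self, trial_id, time_step):
        """Return True when the trial is due for a perturbation."""
        last = self.last_perturbation.get(trial_id, 0)
        if time_step - last >= self.perturbation_interval:
            self.last_perturbation[trial_id] = time_step
            return True
        return False

    def pause(self, trial_id):
        """Remember that a trial is paused so it can be resumed later."""
        if trial_id not in self.paused:
            self.paused.append(trial_id)

    def save(self, path):
        """Write the scheduler state to a JSON file."""
        state = {
            "perturbation_interval": self.perturbation_interval,
            "last_perturbation": self.last_perturbation,
            "paused": self.paused,
        }
        with open(path, "w") as f:
            json.dump(state, f)

    @classmethod
    def restore(cls, path):
        """Rebuild a scheduler from a file written by save()."""
        with open(path) as f:
            state = json.load(f)
        scheduler = cls(state["perturbation_interval"])
        scheduler.last_perturbation = state["last_perturbation"]
        scheduler.paused = state["paused"]
        return scheduler


def demo():
    """Save a scheduler mid-run and restore it, as after an interruption."""
    scheduler = ToyPerturbationScheduler(perturbation_interval=4)
    scheduler.on_result("trial_0", 4)  # trial_0 gets perturbed at step 4
    scheduler.pause("trial_1")         # trial_1 is waiting for resources
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scheduler_state.json")
        scheduler.save(path)
        return ToyPerturbationScheduler.restore(path)
```

In this toy model, the restored scheduler still knows that trial_1 is paused and when trial_0 was last perturbed. If that state were lost on restore, a fresh scheduler would have no record of either, which is loosely analogous to the behaviour described in this thread.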
