Restoring a Tuned Tuner

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi,

I’ve been trying to figure this out for a while but can’t work out how to do it. I use tuner.fit() to train my MAPPO model. Is there a way I can continue training the same model once the run has terminated? From what I can tell, Tuner.restore() cannot be used for runs that have already finished training.

from ray import air, train, tune

tuner = tune.Tuner(
    CentralizedCritic,
    param_space=config.to_dict(),
    run_config=air.RunConfig(
        stop={"timesteps_total": 200_000},
        storage_path=storage_path,
        checkpoint_config=train.CheckpointConfig(
            checkpoint_frequency=5,
            checkpoint_at_end=True,
        ),
    ),
)
results = tuner.fit()

Thanks in advance!

Just so other readers understand correctly: you want to save a model or policy after training has finished?

Maybe it is helpful to check at the link below what RLlib can offer here:
https://docs.ray.io/en/master/rllib/rllib-saving-and-loading-algos-and-policies.html#model-exports

Sorry for not making it clear. I want to tune the model until the stopping conditions are met. My question relates to then continuing to train that same model (e.g. for another 200,000 steps), ideally just continuing from where it was last checkpointed (e.g. with respect to the WandB logs).

OK, in that case I recommend having a close look at How to Save and Load Trial Checkpoints — Ray 3.0.0.dev0. This will also involve some experimentation to find out which kind of checkpointing fits you best.
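For illustration, here is a minimal sketch (not a tested solution, so treat the details as assumptions) of one way to continue training after the Tuner run above has finished: take the checkpoint written by checkpoint_at_end, restore it into an RLlib Algorithm, and keep calling train(). The metric name, the 400,000-step target, and the "timesteps_total" result key are placeholders that depend on your Ray version and config.

from ray.rllib.algorithms.algorithm import Algorithm

# Grab the checkpoint written for the best trial of the finished run.
best_result = results.get_best_result(
    metric="episode_reward_mean", mode="max"
)

# Rebuild the algorithm state (policies, optimizer state, counters) from it.
algo = Algorithm.from_checkpoint(best_result.checkpoint)

# Keep training until roughly another 200k timesteps have been collected.
result = algo.train()
while result["timesteps_total"] < 400_000:
    result = algo.train()

# Write a new checkpoint when done.
algo.save()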

Would this also be applicable if I want to continue training the model on an environment with different starting conditions?

I’m training using a simulation engine as my environment, where the episode length is a configurable number of steps and each step represents a time interval. I want to be able to train my PPO model on a variety of episode lengths and time intervals. Would I need to do something like the function checkpointing described in the linked doc?
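For concreteness, something like the sketch below is what I’m imagining (just a rough sketch, assuming the Ray 2.x AlgorithmConfig API; the env_config keys and values are made-up placeholders for my simulator’s options): restore the trained weights, but build a fresh algorithm whose env_config uses a different episode length and time interval, then keep training that.

from ray.rllib.algorithms.algorithm import Algorithm

# Restore the algorithm trained on the original environment settings.
restored = Algorithm.from_checkpoint(best_result.checkpoint)

# Build a new algorithm with a modified environment configuration
# (episode_length / step_interval_s are placeholders for my env's options).
new_config = config.copy(copy_frozen=False)
new_config.environment(env_config={"episode_length": 500, "step_interval_s": 30})
new_algo = new_config.build()

# Carry the learned policy weights over instead of the full trainer state.
new_algo.set_weights(restored.get_weights())

for _ in range(50):
    new_algo.train()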

Thanks