Restoring a Tuned Tuner

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi,

I’ve been trying to figure this out for a while but can’t work out how to do it. I use tuner.fit() to train my MAPPO model. Is there a way I can continue training the same model once the run has terminated? From what I can tell, Tuner.restore() cannot be used for runs that have already finished training.

from ray import air, train, tune

tuner = tune.Tuner(
    CentralizedCritic,
    param_space=config.to_dict(),
    run_config=air.RunConfig(
        stop={"timesteps_total": 200_000},
        storage_path=storage_path,
        checkpoint_config=train.CheckpointConfig(
            checkpoint_frequency=5,
            checkpoint_at_end=True,
        ),
    ),
)
results = tuner.fit()

Thanks in advance!

Just so other readers understand correctly: you want to save a model or policy after training has finished?

Maybe it is helpful to check at the link below what RLlib can offer here:
https://docs.ray.io/en/master/rllib/rllib-saving-and-loading-algos-and-policies.html#model-exports

Sorry for not making it clear. I want to tune the model until the stopping conditions are met. My question relates to then continuing to train that same model (e.g. for another 200,000 steps), ideally just continuing from where it was last checkpointed (e.g. with respect to the WandB logs).

OK, in that case I recommend having a close look at How to Save and Load Trial Checkpoints — Ray 3.0.0.dev0. This will also involve some experimentation to find out which kind of checkpointing fits you best.
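For illustration, here is a minimal sketch (not a tested solution, so treat the details as assumptions) of one way to continue training after the Tuner run above has finished: take the checkpoint written by checkpoint_at_end, restore it into an RLlib Algorithm, and keep calling train(). The metric name, the 400,000-step target, and the "timesteps_total" result key are placeholders that depend on your Ray version and config.

from ray.rllib.algorithms.algorithm import Algorithm

# Grab the checkpoint written for the best trial of the finished run.
best_result = results.get_best_result(
    metric="episode_reward_mean", mode="max"
)

# Rebuild the algorithm state (policies, optimizer state, counters) from it.
algo = Algorithm.from_checkpoint(best_result.checkpoint)

# Keep training until roughly another 200k timesteps have been collected.
result = algo.train()
while result["timesteps_total"] < 400_000:
    result = algo.train()

# Write a new checkpoint when done.
algo.save()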

Would this also be applicable if I want to continue training the model on an environment with different starting conditions?

I’m training using a simulation engine as my environment, where the episode length is a configurable number of steps and each step represents a time interval. I want to be able to train my PPO model on a variety of episode lengths and time intervals. Would I need to do something like the function checkpointing described in the linked doc?
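For concreteness, something like the sketch below is what I’m imagining (just a rough sketch, assuming the Ray 2.x AlgorithmConfig API; the env_config keys and values are made-up placeholders for my simulator’s options): restore the trained weights, but build a fresh algorithm whose env_config uses a different episode length and time interval, then keep training that.

from ray.rllib.algorithms.algorithm import Algorithm

# Restore the algorithm trained on the original environment settings.
restored = Algorithm.from_checkpoint(best_result.checkpoint)

# Build a new algorithm with a modified environment configuration
# (episode_length / step_interval_s are placeholders for my env's options).
new_config = config.copy(copy_frozen=False)
new_config.environment(env_config={"episode_length": 500, "step_interval_s": 30})
new_algo = new_config.build()

# Carry the learned policy weights over instead of the full trainer state.
new_algo.set_weights(restored.get_weights())

for _ in range(50):
    new_algo.train()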

Thanks