Correct way of using tuner.restore()

Here is a simple example of running Ray Tune and resuming the tuning after the 1st run. The resumed run returns exactly the same score as the 1st run, but I would expect the score to be updated on every re-run. Am I using the restore function incorrectly? Any help is appreciated.

  import os
  from ray import tune, air
  from hyperopt import hp
  from ray.tune.search.hyperopt import HyperOptSearch
  from ray.air import session
  from ray.air.checkpoint import Checkpoint

  # 1. Define an objective function.
  def objective(config):
      score = config["a"] ** 2 + config["b"]
      session.report({'SCORE':score})


  # 2. Define a search space.
  search_space = {
      "a": hp.uniform("a", 0, 1),
      "b": hp.uniform("b", 0, 1)
      }

  raw_log_dir = "ray_log"
  raw_log_name = "example"
  log_dir = os.path.join(os.getcwd(), raw_log_dir, raw_log_name)
  if not os.path.exists(log_dir):
      print('--- this is the 1st run ----')
      algorithm = HyperOptSearch(search_space, metric="SCORE", mode="max")
      tuner = tune.Tuner(objective,
              tune_config = tune.TuneConfig(
                  num_samples = 2, # number of tries. too expensive for Brian2
                  search_alg=algorithm,
                  ),
              param_space=search_space,
              run_config = air.RunConfig(local_dir = raw_log_dir, name = raw_log_name) # where to save the log which will be loaded later
              )
  else: #note: restoring described here doesn't work: https://docs.ray.io/en/latest/tune/tutorials/tune-stopping.html 
      print('--- previous run exist, continue the tuning ----')
      algorithm = HyperOptSearch(search_space, metric="SCORE", mode="max")
      tuner = tune.Tuner.restore(log_dir)

  results = tuner.fit()
  print(results.get_best_result(metric="SCORE", mode="max").config)

This is getting a bit frustrating. The Ray Tune documentation on the restore feature seems to be either outdated or not yet implemented. A concrete example that actually works would be appreciated by new users.

Instead of using the tuner.restore() function, I used algorithm.restore_from_dir(). Now the output from every run is different, but the results look random, i.e. there is no improvement compared to the previous run. The question is which method is the right one to use. Here is a version of the code using algorithm.restore_from_dir():

  import os
  from ray import tune, air
  from hyperopt import hp
  from ray.tune.search.hyperopt import HyperOptSearch


  # 1. Define an objective function.
  def objective(config):
      score = config["a"] ** 2 + config["b"]
      # "SCORE" first appears here; this key defines the metric name, i.e. metric="SCORE"
      tune.report(SCORE=score)  # either this or `return {"SCORE": score}` works


  # 2. Define a search space.
  search_space = {
      "a": hp.uniform("a", 0, 1),
      "b": hp.uniform("b", 0, 1)
      }

  raw_log_dir = "./ray_log"
  raw_log_name = "example"

  algorithm = HyperOptSearch(search_space, metric="SCORE", mode="max")
  if not os.path.exists(os.path.join(raw_log_dir, raw_log_name)):
      print('--- this is the 1st run ----')
  else: #note: restoring described here doesn't work: https://docs.ray.io/en/latest/tune/tutorials/tune-stopping.html 
      print('--- previous run exist, continue the tuning ----')
      algorithm.restore_from_dir(os.path.join(raw_log_dir, raw_log_name))

  # 3. Start a Tune run and print the best result.
  trainable_with_resources = tune.with_resources(objective, {"cpu": 8})
  tuner = tune.Tuner(trainable_with_resources,
          tune_config = tune.TuneConfig(
              num_samples = 2, # number of tries. too expensive for Brian2
              search_alg=algorithm,
              ),
          param_space=search_space,
          run_config = air.RunConfig(local_dir = raw_log_dir, name = raw_log_name) # where to save the log which will be loaded later
          )

  results = tuner.fit()
  print(results.get_best_result(metric="SCORE", mode="max").config)

According to the source code comments, the restore() function is only for "Restores Tuner after a previously failed run." If a previous run did not fail, the restore() function will just print out the tuned result from the previous run.

Ah, it looks like you already tried restoring the search algorithm - that is the approach I recommended in this thread: tuner.restore() won't make progress · Issue #30223 · ray-project/ray · GitHub.

Regarding your comment:

Now the output results from every round of run is different but the result looks random, i.e. without improvement compared to previous run.

You may still need to increase the number of samples, or consider changing the n_initial_points parameter passed into HyperOptSearch: see the API reference here.


A few questions I have about your experience trying Tuner.restore:

  1. What functionality did you expect from Tuner.restore, and what were the biggest gaps?
  2. Which part of the documentation seems outdated/not implemented? What could be added to the docs to make this less confusing?

In the following link:
Stopping and Resuming a Tune Run — Ray 2.1.0

it says "If you've stopped a run and want to resume from where you left off, you can then call Tuner.restore() like this:". It would be clearer to add "where you left off for unfinished trials, as well as giving you the option to restart or resume errored trials", because one has the option to stop the run programmatically.

It would be nice if restore() could continue a finished run in case a user wants to add more samples. In that case, restore() could just call, e.g., algorithm.restore_from_dir(). Then one could use restore() to continue tuning in all conditions, with no need for a different restore function in each condition.