Hyperopt with Ray Tune vs using Hyperopt directly

Hi,

I want to use Hyperopt within Ray in order to parallelize the optimization and make use of all of my computer's resources. However, I noticed a difference in behavior between running Hyperopt through Ray Tune and running the Hyperopt library on its own. When I optimize with Ray, Hyperopt doesn't iterate over the search space looking for the best configuration; each trial runs a single iteration and stops.

I created two small scripts (based on hyperopt_example.py) that reproduce what I describe above:

Hyperopt with Ray:

import ray
from ray import tune
from ray.tune.suggest import ConcurrencyLimiter
from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.tune.suggest.hyperopt import HyperOptSearch


def easy_objective(config):
    # Hyperparameters
    width, height = config["width"], config["height"]
    # Evaluate function
    result_objective = (0.1 + width / 100)**(-1) + height * 0.1
    # Feed the score back to Tune.
    tune.report(mean_loss=result_objective)


if __name__ == "__main__":
    ray.init(configure_logging=False)
    algo = HyperOptSearch()
    algo = ConcurrencyLimiter(algo, max_concurrent=4)
    scheduler = AsyncHyperBandScheduler()
    analysis = tune.run(
        easy_objective,
        search_alg=algo,
        scheduler=scheduler,
        metric="mean_loss",
        mode="min",
        num_samples=20,
        config={
            "width": tune.uniform(0, 20),
            "height": tune.uniform(-100, 100),
        })
    print("Best hyperparameters found were: ", analysis.best_config)

Hyperopt directly:

from hyperopt import Trials, fmin, tpe, hp


def easy_objective(config):
    # Hyperparameters
    width, height = config["width"], config["height"]
    # Evaluate function
    result_objective = (0.1 + width / 100)**(-1) + height * 0.1
    # Return the score to Hyperopt.
    return result_objective


if __name__ == "__main__":
    space = {
        "width": hp.uniform("width", 0, 20),
        "height": hp.uniform("height", -100, 100),
    }
    max_evals = 20
    trials = Trials()
    best = fmin(
        easy_objective, space=space, algo=tpe.suggest,
        trials=trials, max_evals=max_evals)
    print(best)

The first script runs 20 trials in parallel, but each trial runs only a single iteration, so the search behaves like random search and doesn't take advantage of Hyperopt's TPE algorithm.

Output from the first script:

Memory usage on this node: 6.1/31.4 GiB
Using AsyncHyperBand: num_stopped=14
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: None | Iter 1.000: -2.6052516365738967
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/16.16 GiB heap, 0.0/5.57 GiB objects
Current best trial: 7e58507c with mean_loss=-3.863864546812083 and parameters={'width': 14.091504864328861, 'height': -80.14705291844379}
Number of trials: 20/20 (20 TERMINATED)
+-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------+
| Trial name              | status     | loc   |    height |    width |      loss |   iter |   total time (s) |   neg_mean_loss |
+-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------+
| easy_objective_7e3c18e4 | TERMINATED |       |  24.937   |  9.96676 |  7.50203  |      1 |      0.000235796 |       -7.50203  |
| easy_objective_7e3f21b0 | TERMINATED |       |  31.1021  | 10.4803  |  7.99295  |      1 |      0.000176191 |       -7.99295  |
| easy_objective_7e58507c | TERMINATED |       | -80.1471  | 14.0915  | -3.86386  |      1 |      0.000161171 |        3.86386  |
| easy_objective_7e64e710 | TERMINATED |       |  78.8973  | 16.4909  | 11.6646   |      1 |      0.000114441 |      -11.6646   |
| easy_objective_7e69f0ac | TERMINATED |       | -17.2318  | 13.4166  |  2.54729  |      1 |      0.000140667 |       -2.54729  |
| easy_objective_7e72e25c | TERMINATED |       |  -2.67024 | 17.5931  |  3.35707  |      1 |      0.000152349 |       -3.35707  |
| easy_objective_7ea4f026 | TERMINATED |       |  31.2219  |  1.51971 | 11.803    |      1 |      0.000181675 |      -11.803    |
+-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------+
Best hyperparameters found were:  {'width': 14.091504864328861, 'height': -80.14705291844379}

(I trimmed the table to show only a few of the trials.)

The second script, which uses the Hyperopt library directly, iterates 20 times to find the best configuration. Output from the second script:

100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 20/20 [00:00<00:00, 823.61trial/s, best loss: -5.965805313009741]
{'height': -98.2634138673496, 'width': 15.903138344075911}

I would like Ray Tune to behave like the second script: trials running in parallel should still be driven by Hyperopt's algorithm so that it can find the optimal hyperparameter configuration. Am I missing an argument in tune.run() that sets the number of iterations? I couldn't find one in the documentation.

Any help or thoughts would be really appreciated.

You should be able to increase num_samples (to 100, say) and set n_initial_points on HyperOptSearch to something lower (e.g., 10). This gives you 10 random exploration steps followed by 90 TPE-guided search steps.
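
For reference, here is a minimal sketch of that change applied to the first script. It assumes a Ray Tune version whose HyperOptSearch constructor accepts an n_initial_points argument; the rest is the original example, trimmed.

import ray
from ray import tune
from ray.tune.suggest import ConcurrencyLimiter
from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.tune.suggest.hyperopt import HyperOptSearch


def easy_objective(config):
    width, height = config["width"], config["height"]
    # Same toy objective as in the question.
    tune.report(mean_loss=(0.1 + width / 100) ** (-1) + height * 0.1)


if __name__ == "__main__":
    ray.init(configure_logging=False)
    # Assumption: n_initial_points controls how many random startup trials
    # Hyperopt runs before switching to TPE-guided suggestions.
    algo = HyperOptSearch(n_initial_points=10)
    algo = ConcurrencyLimiter(algo, max_concurrent=4)
    scheduler = AsyncHyperBandScheduler()
    analysis = tune.run(
        easy_objective,
        search_alg=algo,
        scheduler=scheduler,
        metric="mean_loss",
        mode="min",
        num_samples=100,  # 10 random trials + 90 TPE-guided trials
        config={
            "width": tune.uniform(0, 20),
            "height": tune.uniform(-100, 100),
        })
    print("Best hyperparameters found were: ", analysis.best_config)

Note that each Tune trial corresponds to a single Hyperopt evaluation, so num_samples in tune.run() plays the same role as max_evals in fmin(); raising it is what gives the TPE sampler more points to learn from.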
