Hyperopt with Ray Tune vs using Hyperopt directly

Hi,

I want to use Hyperopt within Ray in order to parallelize the optimization and use all of my machine's resources. However, I noticed a difference in behavior between running Hyperopt through Ray Tune and running the Hyperopt library on its own. When I optimize with Ray, Hyperopt doesn't iterate over the search space looking for the best configuration; each trial runs a single iteration and stops.

I created two small scripts (based on hyperopt_example.py) that can be run to reproduce the behavior described above:

Hyperopt with Ray:

import ray
from ray import tune
from ray.tune.suggest import ConcurrencyLimiter
from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.tune.suggest.hyperopt import HyperOptSearch
def easy_objective(config):
    # Hyperparameters
    width, height = config["width"], config["height"]
    # Evaluate function
    result_objective = (0.1 + width / 100)**(-1) + height * 0.1
    # Feed the score back to Tune.
    tune.report(mean_loss=result_objective)
if __name__ == "__main__":
    ray.init(configure_logging=False)
    algo = HyperOptSearch()
    algo = ConcurrencyLimiter(algo, max_concurrent=4)
    scheduler = AsyncHyperBandScheduler()
    analysis = tune.run(
        easy_objective,
        search_alg=algo,
        scheduler=scheduler,
        metric="mean_loss",
        mode="min",
        num_samples=20,
        config={
            "width": tune.uniform(0, 20),
            "height": tune.uniform(-100, 100),
        })
    print("Best hyperparameters found were: ", analysis.best_config)

Hyperopt itself:

from hyperopt import Trials, fmin, tpe, hp
def easy_objective(config):
    # Hyperparameters
    width, height = config["width"], config["height"]
    # Evaluate function
    result_objective = (0.1 + width / 100)**(-1) + height * 0.1
    # Return the loss to Hyperopt.
    return result_objective
if __name__ == "__main__":
    space = {
        "width": hp.uniform("width", 0, 20),
        "height": hp.uniform("height", -100, 100),
    }
    max_evals = 20
    trials = Trials()
    best = fmin(
        easy_objective, space=space, algo=tpe.suggest,
        trials=trials, max_evals=max_evals)
    print(best)

The first script runs 20 trials in parallel, but none of them iterates, so the search is effectively random and doesn't take advantage of the Hyperopt implementation.

Output from the first script:

Memory usage on this node: 6.1/31.4 GiB
Using AsyncHyperBand: num_stopped=14
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: None | Iter 1.000: -2.6052516365738967
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/16.16 GiB heap, 0.0/5.57 GiB objects
Current best trial: 7e58507c with mean_loss=-3.863864546812083 and parameters={'width': 14.091504864328861, 'height': -80.14705291844379}
Number of trials: 20/20 (20 TERMINATED)
+-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------+
| Trial name              | status     | loc   |    height |    width |      loss |   iter |   total time (s) |   neg_mean_loss |
|-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------|
| easy_objective_7e3c18e4 | TERMINATED |       |  24.937   |  9.96676 |  7.50203  |      1 |      0.000235796 |       -7.50203  |
| easy_objective_7e3f21b0 | TERMINATED |       |  31.1021  | 10.4803  |  7.99295  |      1 |      0.000176191 |       -7.99295  |
| easy_objective_7e58507c | TERMINATED |       | -80.1471  | 14.0915  | -3.86386  |      1 |      0.000161171 |        3.86386  |
| easy_objective_7e64e710 | TERMINATED |       |  78.8973  | 16.4909  | 11.6646   |      1 |      0.000114441 |      -11.6646   |
| easy_objective_7e69f0ac | TERMINATED |       | -17.2318  | 13.4166  |  2.54729  |      1 |      0.000140667 |       -2.54729  |
| easy_objective_7e72e25c | TERMINATED |       |  -2.67024 | 17.5931  |  3.35707  |      1 |      0.000152349 |       -3.35707  |
| easy_objective_7ea4f026 | TERMINATED |       |  31.2219  |  1.51971 | 11.803    |      1 |      0.000181675 |      -11.803    |
+-------------------------+------------+-------+-----------+----------+-----------+--------+------------------+-----------------+
Best hyperparameters found were:  {'width': 14.091504864328861, 'height': -80.14705291844379}

(I truncated the table to show only a few of the trials.)

The second script, which runs the Hyperopt library directly, iterates 20 times to find the best configuration. Output from the second script:

100%|████████████████████████████████████████| 20/20 [00:00<00:00, 823.61trial/s, best loss: -5.965805313009741]
{'height': -98.2634138673496, 'width': 15.903138344075911}

I would like the same behavior in Ray Tune as in the second script: parallel trials should also iterate using the Hyperopt algorithm so it can find the optimal hyperparameter configuration. Am I missing an argument in tune.run() that sets the number of iterations? I couldn't find one in the documentation.

Any help or thoughts would be really appreciated.

You should be able to increase num_samples (to 100, say) and set n_initial_points on HyperOptSearch to something lower (e.g., 10). That gives you 10 random startup trials followed by 90 TPE-guided trials. Note that each Tune trial corresponds to one Hyperopt evaluation, so the search iterates across trials rather than within a single trial; with num_samples=20 and the default n_initial_points of 20, every sample comes from the random startup phase, which is why your search looks random.
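For reference, here's a minimal sketch of how that could look with the script above. I'm assuming n_initial_points is the HyperOptSearch constructor argument controlling the number of random startup trials (check the search-algorithm docs for your Ray version):

import ray
from ray import tune
from ray.tune.suggest import ConcurrencyLimiter
from ray.tune.suggest.hyperopt import HyperOptSearch

def easy_objective(config):
    # Same toy objective as above.
    width, height = config["width"], config["height"]
    tune.report(mean_loss=(0.1 + width / 100)**(-1) + height * 0.1)

if __name__ == "__main__":
    ray.init(configure_logging=False)
    # 10 random startup trials; after that, TPE conditions each new
    # suggestion on the results of all completed trials.
    algo = HyperOptSearch(n_initial_points=10)
    algo = ConcurrencyLimiter(algo, max_concurrent=4)
    analysis = tune.run(
        easy_objective,
        search_alg=algo,
        metric="mean_loss",
        mode="min",
        num_samples=100,  # 10 random + 90 TPE-guided trials
        config={
            "width": tune.uniform(0, 20),
            "height": tune.uniform(-100, 100),
        })
    print("Best hyperparameters found were: ", analysis.best_config)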


Hi, I'm also using Ray with Hyperopt, based on the example given on Ray's website, so my code is similar to the one above. How do we know whether Hyperopt iterates over the search space trying to find the best configuration, or just runs one iteration and stops?

My output is something like this:

Trial status: 6 TERMINATED | 4 RUNNING
Current time: 2025-07-02 21:01:55. Total running time: 9min 1s
Logical resource usage: 4.0/64 CPUs, 0/1 GPUs (0.0/1.0 accelerator_type:RTX)
Current best trial: 13d748cd with mean_loss=inf and params={'inj_size': 0.8125, 'suc_size': 1.625, 'inj_pos': 5.0, 'suc_pos': 75.0}
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Trial name                status      inj_size  suc_size  inj_pos  suc_pos  loss  iter  total time (s)  │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ obj_fn_max_clcd_b2481019  RUNNING     1.13623   1.80154   6.67009  70.5346                              │
│ obj_fn_max_clcd_64a77b85  RUNNING     1.2856    1.87866   5.68852  81.8336                              │
│ obj_fn_max_clcd_1cd34f1b  RUNNING     0.77516   1.45742   4.88232  84.3437                              │
│ obj_fn_max_clcd_129d3b86  RUNNING     0.719128  1.21721   3.82652  72.8304                              │
│ obj_fn_max_clcd_13d748cd  TERMINATED  0.8125    1.625     5        75       inf   1     258.168         │
│ obj_fn_max_clcd_3cb1ce60  TERMINATED  0.905043  2.3241    6.00982  77.9068  inf   1     260.269         │
│ obj_fn_max_clcd_5aa0d534  TERMINATED  0.37743   0.808731  5.48265  75.7197  inf   1     273.151         │
│ obj_fn_max_clcd_29e9036b  TERMINATED  0.916731  2.56642   4.75673  77.9306  inf   1     267.198         │
│ obj_fn_max_clcd_cc0c04d9  TERMINATED  0.835354  1.77404   4.43513  70.2674  inf   1     268.93          │
│ obj_fn_max_clcd_1d33b410  TERMINATED  1.26807   1.40047   6.39778  76.1896  inf   1     265.539         │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────╯
I can find num_samples in the code above, but where does "n_initial_points" go?

Based on Ray's hyperopt example, my code has:

initial_params = [
    {"inj_size": 0.8125, "suc_size": 1.625, "inj_pos": 5.0, "suc_pos": 75.0},
]

Does that mean I have to give 10 initial sets of values in the initial_params list?
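In case it helps, here is roughly how the rest of my setup looks, following Ray's example. As far as I understand, points_to_evaluate is where initial_params goes, and it is separate from the random startup count (on recent Ray versions the import path is ray.tune.search.hyperopt rather than ray.tune.suggest.hyperopt):

from ray.tune.search.hyperopt import HyperOptSearch

# initial_params seeds the search with explicit configurations that are
# evaluated first; it does not control how many random startup trials
# run before TPE takes over (that would be n_initial_points).
initial_params = [
    {"inj_size": 0.8125, "suc_size": 1.625, "inj_pos": 5.0, "suc_pos": 75.0},
]
algo = HyperOptSearch(points_to_evaluate=initial_params)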

Thanks!

Btw, is there any way to let Hyperopt communicate across different workers, so that each one knows the current best parameters and can continue iterating from them? I think that would be more effective.

In a simple single-process Hyperopt run, we give it a set of starting parameters and let it run, say, 100 iterations; it uses the previous step's result to choose a new set of parameters for the next run.
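A rough sketch of the single-process behavior I mean (toy objective and bounds just for illustration; my real objective is obj_fn_max_clcd):

from hyperopt import Trials, fmin, tpe, hp

def objective(config):
    # Toy stand-in for my real objective function.
    return (config["inj_size"] - 1.0)**2 + (config["suc_size"] - 2.0)**2

space = {
    "inj_size": hp.uniform("inj_size", 0.25, 1.5),
    "suc_size": hp.uniform("suc_size", 0.5, 3.0),
}
trials = Trials()
# fmin runs sequentially: after each evaluation, TPE looks at every
# completed trial in `trials` before proposing the next point.
best = fmin(objective, space=space, algo=tpe.suggest,
            trials=trials, max_evals=100)
print(best)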