I was trying to use Ray Tune with my k-fold index k as a hyperparameter, in order to generate a table of accuracies for comparison across the k folds. The k index selects which fold of the dataset is used. However, when I put k into a grid search, each value of k does run once, but each run gets a different set of the remaining parameters.
Would it be possible to add a new search space API that repeats each sampled parameter set for every value of k?
e.g.
Instead of…
config = {"p1": tune.loguniform(lr_min, lr_max),
"p2": tune.choice([1,2,3]),
"k": tune.grid_search([*range(k)]),}
k = 1, p1=0.2, p2=1
k = 2, p1=0.6, p2=3
k = 3, p1=0.1, p2=3
or
config = {"p1": tune.grid_search([0.1,0.2,0.6]),
"p2": tune.grid_search([1,2,3]),
"k": tune.grid_search([*range(k)]),}
k = 1, p1=0.2, p2=1
k = 2, p1=0.2, p2=1
k = 3, p1=0.2, p2=1
k = 1, p1=0.6, p2=3
k = 2, p1=0.6, p2=3
k = 3, p1=0.6, p2=3
k = 1, p1=0.1, p2=3
k = 2, p1=0.1, p2=3
k = 3, p1=0.1, p2=3
suggestion:
config = {"p1": tune.loguniform(lr_min, lr_max),
"p2": tune.choice([1,2,3]),
"k": tune.repeated_grid_search([*range(k)]),}
k = 1, p1=0.2, p2=1
k = 2, p1=0.2, p2=1
k = 3, p1=0.2, p2=1
k = 1, p1=0.6, p2=3
k = 2, p1=0.6, p2=3
k = 3, p1=0.6, p2=3
k = 1, p1=0.1, p2=3
k = 2, p1=0.1, p2=3
k = 3, p1=0.1, p2=3
Although the output of the second case looks the same as the suggestion, the two are not equivalent: in the second case p1 and p2 come from a predefined grid, so the choices are fixed up front and the search cannot react to results as they come in. In the suggested example, p1 and p2 are still sampled by the search algorithm, so they can be chosen dynamically based on past results to optimize the tuning.
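For what it's worth, something close to this seems achievable today with Ray Tune's Repeater wrapper, which re-runs every configuration suggested by a searcher a fixed number of times and, if I read the docs correctly, exposes the repeat index in the config and reports the averaged score back to the wrapped searcher. Below is a minimal sketch under that assumption; the OptunaSearch searcher, the dummy accuracy formula, and the concrete bounds for p1 are my own placeholders, not part of the proposal:

# Sketch only (not the proposed API): approximating "repeat each sampled
# config once per fold" with Ray Tune's Repeater wrapper.
from ray import tune
from ray.tune.search import Repeater
from ray.tune.search.optuna import OptunaSearch

K = 3

def train_fold(config):
    # Repeater (with its default set_index=True) puts the repeat index into
    # the config; here it doubles as the fold index k.
    fold = config["__trial_index__"]
    p1, p2 = config["p1"], config["p2"]
    # Dummy stand-in for training/evaluating on the selected fold.
    accuracy = 1.0 / (1.0 + abs(p1 - 0.01)) + 0.01 * p2 + 0.001 * fold
    # Returning a dict reports it as the trial's final result.
    return {"accuracy": accuracy, "fold": fold}

# OptunaSearch adaptively samples p1/p2; Repeater re-runs each suggested
# config K times, so every (p1, p2) pair is evaluated on every fold.
search_alg = Repeater(OptunaSearch(metric="accuracy", mode="max"), repeat=K)

tuner = tune.Tuner(
    train_fold,
    param_space={
        "p1": tune.loguniform(1e-4, 1e-1),
        "p2": tune.choice([1, 2, 3]),
    },
    tune_config=tune.TuneConfig(search_alg=search_alg, num_samples=3 * K),
)
tuner.fit()

The main difference from the proposal is that the fold index comes from the repeat counter rather than from a dedicated search space primitive like repeated_grid_search.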
Also, it would be even better if there were a way for this dynamic hyperparameter sampling to be driven by the average performance across the k folds, rather than by each fold's result individually.
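As a sketch of what driving the search by the averaged metric could look like from the user side today (again my own workaround, with OptunaSearch and a placeholder per-fold evaluation standing in for the real training code): run all k folds inside a single trial and report only the mean accuracy, so the adaptive sampling of p1 and p2 already reacts to cross-fold performance while the per-fold values still end up in the results table.

# Sketch only: evaluate all K folds inside one trial and report the mean,
# so the searcher optimizes average cross-fold accuracy.
from statistics import mean

from ray import tune
from ray.tune.search.optuna import OptunaSearch

K = 3

def evaluate_fold(fold, p1, p2):
    # Placeholder: substitute real training/validation on the given fold.
    return 1.0 / (1.0 + abs(p1 - 0.01)) + 0.01 * p2 + 0.001 * fold

def train_cv(config):
    accuracies = [evaluate_fold(fold, config["p1"], config["p2"]) for fold in range(K)]
    # The mean drives the searcher; per-fold values are kept for the
    # cross-fold comparison table.
    return {
        "mean_accuracy": mean(accuracies),
        **{f"fold_{i}_accuracy": acc for i, acc in enumerate(accuracies)},
    }

tuner = tune.Tuner(
    train_cv,
    param_space={
        "p1": tune.loguniform(1e-4, 1e-1),
        "p2": tune.choice([1, 2, 3]),
    },
    tune_config=tune.TuneConfig(
        search_alg=OptunaSearch(),  # adaptive sampling of p1/p2
        metric="mean_accuracy",
        mode="max",
        num_samples=9,
    ),
)
results = tuner.fit()

The downside is that the k folds of one parameter set no longer run as separate trials, which is exactly what the proposed repeated_grid_search would avoid.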