I want to run an experiment with multiple datasets, searching for the best combination of hyperparameters for my model on each dataset.
Which would be better: having multiple trainers, one per dataset, or having one trainer with a grid_search parameter over the datasets?
The second option is not compatible with some search algorithms like HyperOpt, though.
It seems like you care about the best hyperparameters per dataset. Launching multiple runs with one dataset per trainer (and tuning over whatever hyperparameters) is what I’d recommend.
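To illustrate the structure of "one search per dataset" without tying it to a particular framework, here is a minimal sketch in plain Python. The dataset names, the grid, and the `evaluate` function are all hypothetical placeholders standing in for a real training run:

```python
# Hypothetical sketch: run an independent hyperparameter search per dataset,
# so each dataset gets its own best configuration and results never mix.
from itertools import product

def evaluate(dataset, lr, batch_size):
    # Stand-in for a real training run; returns a fake score
    # that peaks at lr=0.01, batch_size=32 for demonstration.
    return -abs(lr - 0.01) - abs(batch_size - 32) / 100

datasets = ["dataset_a", "dataset_b"]
grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}

best = {}
for ds in datasets:
    # One search per dataset: every grid point is scored on this dataset only.
    trials = [
        ((lr, bs), evaluate(ds, lr, bs))
        for lr, bs in product(grid["lr"], grid["batch_size"])
    ]
    best[ds] = max(trials, key=lambda t: t[1])[0]

print(best)  # best (lr, batch_size) per dataset
```

In a Tune-style setup, each iteration of the outer loop would correspond to launching a separate run with its own search over the hyperparameter space.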
That's what I'm ending up doing.
Btw, if I want to run multiple trainers (not concurrently), I do:
for t in tqdm(trainers):
    ...
But the progress bar is not shown because the trainer interface overrides it. Is it possible to show which trainer is currently running?
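One option is to label the bar itself with tqdm's `set_description`, which is part of tqdm's real API. A minimal sketch, where the `Trainer` class and its `fit` method are placeholders for whatever your trainers actually expose:

```python
# Sketch: show the current trainer's name on the tqdm bar.
# `Trainer`, `.name`, and `.fit()` are hypothetical stand-ins.
import io
from tqdm import tqdm

class Trainer:
    def __init__(self, name):
        self.name = name

    def fit(self):
        pass  # real training would happen here

trainers = [Trainer("a"), Trainer("b")]

# Writing the bar to a buffer here just keeps the demo quiet;
# in a terminal you would omit the `file=` argument.
pbar = tqdm(trainers, file=io.StringIO())
for t in pbar:
    pbar.set_description(f"trainer {t.name}")  # label shown on the bar
    t.fit()
```

If the trainer's own output still clobbers the bar, `tqdm.write(...)` is another real tqdm call that prints a line without breaking the bar's rendering.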