I am optimizing a TensorFlow trainer object using the HyperBandForBOHB scheduler and the BOHB searcher. Although I am using BOHB, I believe my issues are generic to the Hyperband scheduler in general. In Hyperband, the algorithm creates an ideal experiment schedule based on the training budget available to it, defined in “iterations”. The authors of the paper define an “iteration” as the minimum amount of optimization needed for models to start to meaningfully separate in performance, and they suggest defining it as some fraction of a single epoch. You are then expected to define the total budget for a trial as R, which is some number of “iterations”. Hyperband then builds the trial schedule based on multi-armed bandit theory, balancing exploring many configurations quickly against testing configurations with large budgets to hedge against noisy early trial performance.
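To make the schedule concrete, here is a small sketch of the bracket construction as I understand it from Algorithm 1 of the Hyperband paper; `hyperband_brackets` is a name I made up for illustration, and Tune's internal implementation may differ in its exact bracket accounting:

```python
def hyperband_brackets(R: int, eta: int = 3):
    """Enumerate the Hyperband schedule for total per-trial budget R and
    halving rate eta, per Algorithm 1 of the Hyperband paper.

    Returns a list of brackets; each bracket is a list of
    (num_configs, per_config_budget) rungs produced by successive halving.
    """
    # s_max = floor(log_eta(R)), computed with exact integer arithmetic
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R  # total budget assigned to each bracket

    brackets = []
    for s in range(s_max, -1, -1):
        # initial number of configurations in this bracket
        n = -(-(B * eta**s) // (R * (s + 1)))  # ceil division
        r = R / eta**s                          # initial per-config budget
        # successive halving: keep the top 1/eta configs at each rung,
        # multiplying each survivor's budget by eta
        rungs = [(n // eta**i, r * eta**i) for i in range(s + 1)]
        brackets.append(rungs)
    return brackets

for rungs in hyperband_brackets(81, 3):
    print(rungs)
```

With R=81 and eta=3 this reproduces the five brackets from Table 1 of the paper, e.g. the most aggressive bracket runs 81 configs for 1 iteration each and halves down to 1 config at the full 81 iterations.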
In Tune there is a num_samples parameter that acts as a hard limit on the number of trials Tune will run. It appears that, at this point in time, if you set num_samples to less than the number of trials in the Hyperband schedule, Tune will simply take the first num_samples trials from that schedule. My problem with this is that I believe it breaks the theory behind the Hyperband schedule. It would be nice if you could set num_samples either to None or to something like HyperBand.compute_num_samples(R, eta). This would run the appropriate number of trials and let the user experiment with how R and eta affect the number of trials.
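For reference, a sketch of the helper I am proposing; compute_num_samples does not exist in Ray Tune today, and the formula below follows the paper (sum of the initial configuration counts over all brackets) rather than any Tune internals:

```python
def compute_num_samples(R: int, eta: int = 3) -> int:
    """Total number of trials a full Hyperband schedule launches for
    budget R and halving rate eta (sum of initial configs per bracket)."""
    # s_max = floor(log_eta(R)), computed with exact integer arithmetic
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R
    # initial n per bracket s = s_max .. 0, summed (ceil division)
    return sum(-(-(B * eta**s) // (R * (s + 1))) for s in range(s_max, -1, -1))

print(compute_num_samples(81, 3))  # 81 + 34 + 15 + 8 + 5 = 143
```

A user could then pass num_samples=compute_num_samples(R, eta) to tune.run and know the full schedule will execute.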
*Additional question: the documentation mentions both max_t and R but never explicitly says they are the same thing. Are they?
The other concern is about controlling the minimum budget, i.e. the “iteration”. Currently my trainer reports results back to Tune every mini-batch. Because of this, if I set time_attr to “training_iteration”, the algorithm assumes my minimum budget is a single mini-batch, which is far too short a training time for Hyperband to work. Is it possible to create new time_attrs like “1000_training_iterations” or something, and how could that be done? Or am I required to report results back to Tune only every x training iterations, or every epoch? The second option is not ideal, as I log all my results through Ray Tune’s report as well.
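One possible workaround, assuming I understand time_attr correctly: I believe Tune schedulers accept any monotonically increasing numeric key from the result dict as time_attr, so the trainer could keep reporting every mini-batch while also reporting a coarser counter. A minimal sketch, where the metric name coarse_iter and the 1000-batch granularity are my own inventions:

```python
# Assumed granularity: one Hyperband "iteration" = 1000 mini-batches.
BATCHES_PER_ITERATION = 1000

def coarse_iteration(global_batch: int,
                     batches_per_iter: int = BATCHES_PER_ITERATION) -> int:
    """Map the raw per-mini-batch step count onto a coarser counter
    suitable for use as a Hyperband time_attr."""
    return global_batch // batches_per_iter

# In the training loop you would still report every mini-batch, but include
# the coarse counter in the result dict, e.g.:
#   tune.report(loss=loss, coarse_iter=coarse_iteration(global_batch))
# and construct the scheduler with time_attr="coarse_iter", so that max_t
# is measured in blocks of 1000 mini-batches rather than single batches.

print(coarse_iteration(999))
print(coarse_iteration(1000))
```

This way per-batch logging is preserved, while the scheduler's halving decisions happen at the coarser granularity.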