I am optimizing a TensorFlow trainer object using the HyperBandForBOHB scheduler and the BOHB searcher. Although I am using BOHB, I believe my issues are generic to the Hyperband scheduler in general. In Hyperband, the algorithm creates an ideal experiment schedule based on the training budget available to it, defined in “iterations”. The authors of the paper define an “iteration” as the minimum amount of optimization needed for models to start to meaningfully separate in performance, and they suggest defining it as some fraction of a single epoch. You are then expected to define the total budget for a trial as R, which is some number of “iterations”. Hyperband then builds the trial schedule based on multi-armed bandit theory, balancing exploring many configurations quickly against testing configurations with large budgets to hedge against noisy early trial performance.
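To make the schedule concrete, here is a small sketch of the bracket construction as I understand it from Algorithm 1 of the Hyperband paper; `hyperband_brackets` is a name I made up for illustration, and Tune's internal implementation may differ in its exact bracket accounting:

```python
def hyperband_brackets(R: int, eta: int = 3):
    """Enumerate the Hyperband schedule for total per-trial budget R and
    halving rate eta, per Algorithm 1 of the Hyperband paper.

    Returns a list of brackets; each bracket is a list of
    (num_configs, per_config_budget) rungs produced by successive halving.
    """
    # s_max = floor(log_eta(R)), computed with exact integer arithmetic
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R  # total budget assigned to each bracket

    brackets = []
    for s in range(s_max, -1, -1):
        # initial number of configurations in this bracket
        n = -(-(B * eta**s) // (R * (s + 1)))  # ceil division
        r = R / eta**s                          # initial per-config budget
        # successive halving: keep the top 1/eta configs at each rung,
        # multiplying each survivor's budget by eta
        rungs = [(n // eta**i, r * eta**i) for i in range(s + 1)]
        brackets.append(rungs)
    return brackets

for rungs in hyperband_brackets(81, 3):
    print(rungs)
```

With R=81 and eta=3 this reproduces the five brackets from Table 1 of the paper, e.g. the most aggressive bracket runs 81 configs for 1 iteration each and halves down to 1 config at the full 81 iterations.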
In Tune there is a num_samples parameter that acts as a hard limit on the number of trials Tune will run. It appears that, at this point in time, if you set num_samples to less than the number of trials in the Hyperband schedule, Tune will simply take the first num_samples trials from that schedule. My problem with this is that I believe it breaks the theory behind the Hyperband schedule. It would be nice if you could set num_samples either to None or to something like HyperBand.compute_num_samples(R, eta). This would run the appropriate number of trials and let the user experiment with how R and eta affect the number of trials.
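For reference, a sketch of the helper I am proposing; compute_num_samples does not exist in Ray Tune today, and the formula below follows the paper (sum of the initial configuration counts over all brackets) rather than any Tune internals:

```python
def compute_num_samples(R: int, eta: int = 3) -> int:
    """Total number of trials a full Hyperband schedule launches for
    budget R and halving rate eta (sum of initial configs per bracket)."""
    # s_max = floor(log_eta(R)), computed with exact integer arithmetic
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R
    # initial n per bracket s = s_max .. 0, summed (ceil division)
    return sum(-(-(B * eta**s) // (R * (s + 1))) for s in range(s_max, -1, -1))

print(compute_num_samples(81, 3))  # 81 + 34 + 15 + 8 + 5 = 143
```

A user could then pass num_samples=compute_num_samples(R, eta) to tune.run and know the full schedule will execute.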
*Additional question: the documentation mentions both max_t and R but never explicitly says they are the same thing. Are they?
The other concern is about controlling the minimum budget, i.e. the “iteration”. Currently my trainer reports results back to Tune every mini-batch. Because of this, if I set time_attr to “training_iteration”, the algorithm assumes my minimum budget is a single mini-batch, which is far too short a training time for Hyperband to work. Is it possible to create new time_attrs like “1000_training_iterations” or something, and how could that be done? Or am I required to report results back to Tune only every x training iterations, or every epoch? The second option is not ideal, as I log all my results through Ray Tune’s report as well.
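One possible workaround, assuming I understand time_attr correctly: I believe Tune schedulers accept any monotonically increasing numeric key from the result dict as time_attr, so the trainer could keep reporting every mini-batch while also reporting a coarser counter. A minimal sketch, where the metric name coarse_iter and the 1000-batch granularity are my own inventions:

```python
# Assumed granularity: one Hyperband "iteration" = 1000 mini-batches.
BATCHES_PER_ITERATION = 1000

def coarse_iteration(global_batch: int,
                     batches_per_iter: int = BATCHES_PER_ITERATION) -> int:
    """Map the raw per-mini-batch step count onto a coarser counter
    suitable for use as a Hyperband time_attr."""
    return global_batch // batches_per_iter

# In the training loop you would still report every mini-batch, but include
# the coarse counter in the result dict, e.g.:
#   tune.report(loss=loss, coarse_iter=coarse_iteration(global_batch))
# and construct the scheduler with time_attr="coarse_iter", so that max_t
# is measured in blocks of 1000 mini-batches rather than single batches.

print(coarse_iteration(999))
print(coarse_iteration(1000))
```

This way per-batch logging is preserved, while the scheduler's halving decisions happen at the coarser granularity.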