Ray Tune iter vs AutoGluon epochs

How severely does this issue affect your experience of using Ray?

  • None: Just asking a question out of curiosity

I am using AutoGluon with Ray Tune for a model training task and have encountered behavior that I’m hoping to understand better.

Here’s the configuration I’ve used:

from ray import tune

# `models` is a list of timm checkpoint names (defined elsewhere)
hyperparameters = {
    'optimization.max_epochs': 10,
    'model.timm_image.checkpoint_name': tune.choice(models),
    'optimization.learning_rate': tune.loguniform(1e-4, 1e-1),
}
hyperparameter_tune_kwargs = {
    'num_trials': 15,
    'searcher': 'random',
    'scheduler': 'FIFO',
}

After training with the above settings, all 15 trials ran as expected. In some trials, however, the ‘iter’ count reached as high as 20, which exceeds the ‘max_epochs’ limit of 10 that I had set. This leads me to a few questions:

  1. Are ‘epochs’ and ‘iter’ unrelated within the context of AutoGluon and Ray Tune?
  2. What exactly does ‘iter’ represent, and how is it defined?
  3. Is there a way to limit the number of ‘iter’? If so, would it be meaningful to do this?
  4. Can I control the number of epochs more explicitly, or even tune the number of epochs as one of the hyperparameters?
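To make questions 1 and 2 concrete, here is my working guess written out as arithmetic. It assumes, without confirmation from the AutoGluon or Ray Tune docs, that ‘iter’ counts how often a trial reports results to Ray Tune rather than how many epochs it has run, and that a report is sent after each validation pass. The function and the `checks_per_epoch` parameter are hypothetical names I made up for illustration; they are not part of either library.

```python
def expected_iters(max_epochs: int, checks_per_epoch: int) -> int:
    """Number of result reports a trial would send under the assumption
    that it reports once per validation pass (hypothetical model, not a
    documented AutoGluon or Ray Tune formula)."""
    if max_epochs < 0 or checks_per_epoch < 0:
        raise ValueError("both arguments must be non-negative")
    return max_epochs * checks_per_epoch


# With my setting of 10 epochs, two validation passes per epoch would
# produce the 20 iterations I observed; one pass per epoch would give 10.
print(expected_iters(10, 2))
print(expected_iters(10, 1))
```

If this guess is right, ‘iter’ and ‘max_epochs’ would be related but measured in different units, which would explain why 20 does not actually violate the limit of 10. I would appreciate confirmation or correction.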

Any insights into these discrepancies or advice on how to manage the ‘iter’ count would be greatly appreciated.

Thank you in advance for your assistance!