How to manipulate the `training_iteration` for each trial in Ray Tune

Hi, I want to obtain the detailed confidence values for each evaluation example during successive halving (ray/hyperband.py at master · ray-project/ray · GitHub).

To achieve this, I use an indicator in config, and in my TrainerClass I use this indicator to decide whether step executes training or evaluation. However, each such call still increases training_iteration. What I want is to freeze training_iteration during evaluation. Are there any possible solutions to this?
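For concreteness, here is a minimal sketch of that pattern; the do_eval flag and the placeholder metrics are illustrative assumptions, not the actual code:

```python
from ray import tune


class MyTrainable(tune.Trainable):
    def setup(self, config):
        # Hypothetical indicator: when set, step() runs evaluation
        # instead of training.
        self.do_eval = config.get("do_eval", False)

    def step(self):
        if self.do_eval:
            # Evaluation-only pass. This call still bumps
            # training_iteration, which is the problem described above.
            return {"val_accuracy": 0.9}  # placeholder metric
        # Normal training pass.
        return {"loss": 0.1}  # placeholder metric
```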

Tune will immediately log the result returned from step under its own training iteration. So, to batch multiple calls to step into a single “report”, you will probably want to override the train method of Trainable, as sketched below.
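A rough, untested sketch of that idea, assuming a hypothetical STEPS_PER_REPORT batching factor and placeholder metrics:

```python
from ray import tune


class BatchedTrainable(tune.Trainable):
    STEPS_PER_REPORT = 4  # hypothetical batching factor

    def setup(self, config):
        self.loss = 1.0

    def step(self):
        self.loss *= 0.9  # placeholder training pass
        return {"loss": self.loss}

    def train(self):
        # Run the first STEPS_PER_REPORT - 1 sub-steps ourselves and
        # discard their intermediate results. The base-class train()
        # then runs the final sub-step and logs its result under a
        # single training_iteration.
        for _ in range(self.STEPS_PER_REPORT - 1):
            self.step()
        return super().train()
```

With this, training_iteration advances once per STEPS_PER_REPORT calls to step(), at the cost of dropping the intermediate results.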

First, I want to make sure I’m understanding your problem correctly; then I can give you some starting points for how to do this:

  • Is this an accurate summary of what you want to achieve?

You want to obtain some metrics from the scheduler while it’s doing successive halving, which happens in on_trial_result (after the training step has already been performed), but you want those metrics to be appended to the result reported for the same iteration.

  • Could you clarify what you mean by the indicator switching between training and evaluation?

Thanks for your reply.
First, your summary is correct.

Second, what I want to do is to collect the validation accuracy of multiple trials, and then use these values to compute an indicator that suggests whether all trials should run an evaluation on another reserved dataset. In that scenario, the step function executes evaluation instead of model training (as in the sketch above). However, when all trials execute the step function, training_iteration increases, which misleads the TrialScheduler (e.g., HyperBand) into making scheduling decisions on evaluation-only results.

Got it.
I think this is about more than how flexible our training_iteration abstraction is. It’s also about how to tell the scheduler to ignore certain reported results (for a validation round, you probably wouldn’t want the scheduler to act on it).
Unfortunately, I don’t think we support your scenario today out of the box.
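That said, one possible workaround is to wrap an existing scheduler so it skips decision-making on validation-only results. This is an untested sketch; the is_validation_round result key is an assumption for illustration, not a Tune convention:

```python
from ray.tune.schedulers import HyperBandScheduler, TrialScheduler


class SkipValidationScheduler(HyperBandScheduler):
    def on_trial_result(self, trial_runner, trial, result):
        if result.get("is_validation_round"):
            # Let the trial continue without feeding this result
            # into the halving logic.
            return TrialScheduler.CONTINUE
        return super().on_trial_result(trial_runner, trial, result)
```

Trials would then tag evaluation-only results with is_validation_round: True so the scheduler leaves them out of its halving decisions.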