I’m trying to perform hyperparameter tuning for an xgboost classifier. I gather there are at least two ways to do this and I’m trying to figure out what exactly they do and how they might differ:

(1) running tune.run directly with an xgboost training function, or
(2) using tune-sklearn’s TuneSearchCV.
How exactly does early stopping work in either of those cases? Are there differences between these two implementations?
If, say, I run (2) with a classifier with n_estimators=500 and set early_stopping=True and n_trials=20, then Tune will run 20 parameter combinations but not necessarily each for the full 500 boosting rounds, correct?
But what determines after how many rounds the score will be checked and which trials will be prematurely abandoned? Is there a way for me to see after the fact how each of these trials was handled, i.e. when it was killed and why?
In the tune.run case, early stopping happens between boosting rounds (assuming you use the xgboost callback in Tune). The early stopping decision is made by whichever Tune scheduler you choose.
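Roughly, a minimal sketch of (1) could look like this (assuming the TuneReportCallback xgboost integration; the dataset, metric, and search space here are just placeholders):

```python
# Sketch of (1): tune.run driving xgb.train, with the Tune xgboost callback
# reporting the eval metric after every boosting round so the scheduler
# (here ASHA) can stop trials between rounds.
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.integration.xgboost import TuneReportCallback


def train_xgb(config):
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25)
    dtrain = xgb.DMatrix(X_train, label=y_train)
    dval = xgb.DMatrix(X_val, label=y_val)
    xgb.train(
        config,
        dtrain,
        num_boost_round=500,  # upper bound on boosting rounds per trial
        evals=[(dval, "eval")],
        # Reports "eval-logloss" to Tune after each boosting round.
        callbacks=[TuneReportCallback({"eval-logloss": "eval-logloss"})],
    )


analysis = tune.run(
    train_xgb,
    config={
        "objective": "binary:logistic",
        "eval_metric": "logloss",
        "max_depth": tune.randint(2, 10),
        "eta": tune.loguniform(1e-3, 0.3),
    },
    num_samples=20,
    # ASHA decides which trials to stop early based on the reported metric.
    scheduler=ASHAScheduler(metric="eval-logloss", mode="min"),
)
```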
In tune-sklearn, we actually implement incremental fitting for xgboost models. By default the stopping decision is made by ASHA, but you can provide any Tune scheduler. You can set TuneSearchCV(verbose=...) to see how/when decisions are made.
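For (2), something along these lines would match the setup you describe (a rough sketch; the data, search space, and scoring are placeholders):

```python
# Sketch of (2): TuneSearchCV with incremental fitting and ASHA-based
# early stopping across 20 sampled parameter combinations.
from sklearn.datasets import load_breast_cancer
from tune_sklearn import TuneSearchCV
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

search = TuneSearchCV(
    XGBClassifier(n_estimators=500),  # upper bound on boosting rounds
    param_distributions={
        "max_depth": [2, 4, 6, 8, 10],
        "learning_rate": [0.01, 0.05, 0.1, 0.3],
    },
    n_trials=20,              # 20 sampled parameter combinations
    early_stopping=True,      # defaults to ASHA for the stopping decision
    scoring="accuracy",
    verbose=2,                # shows how/when stopping decisions are made
)
search.fit(X, y)

# Standard scikit-learn style results are available afterwards:
print(search.best_params_)
print(search.cv_results_)
```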
That helped a lot! Mostly consolidated the understanding I had cobbled together from various docs. I hope I can ask some more clarifying questions:
So in (2) each trial will run xgboost for m boosting rounds, then check the score (which score, on which dataset? the mean test score?) and then make a termination decision. If it is never terminated, the trial will run for the full n_estimators rounds? And how do I set m?
Is there a way to get at the trial histories other than through the verbose logs?
Hmm, perhaps that lingo was a bit too xgboost-specific. What I’m interested in is how to set the batch size for each incremental fit and how to set the maximum total number of boosting rounds.
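From the TuneSearchCV docstring I’m guessing max_iters is the relevant knob, i.e. something along these lines, though I’m not sure how the rounds actually get divided up (this is just my assumption):

```python
# My guess (unverified): max_iters sets how many incremental-fit checkpoints
# each trial gets, and n_estimators caps the total boosting rounds, so each
# incremental fit would cover roughly n_estimators / max_iters rounds?
from tune_sklearn import TuneSearchCV
from xgboost import XGBClassifier

search = TuneSearchCV(
    XGBClassifier(n_estimators=500),   # max total boosting rounds per trial?
    param_distributions={"max_depth": [2, 4, 6, 8, 10]},
    n_trials=20,
    early_stopping=True,
    max_iters=10,                      # number of score checks / stopping points?
)
```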