How to report results less frequently for TuneGridSearchCV?

Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I am using TuneGridSearchCV to tune an XGBoost model of significant size on my university’s HPC cluster. I get errors saying “Processing trial results took X s, which may be a performance bottleneck. Please consider reporting results less frequently to Ray Tune.” How can I report less frequently for TuneGridSearchCV? The results are written to ray_results in my home directory, which is very slow to write to. Our work directories are very fast, so it might also help if I can move ray_results to a different directory.

I tried setting early_stopping = True, which speeds up the process and removes the bottleneck, but then I get two warnings: “UserWarning: early_stopping is enabled but max_iters = 1. To enable partial training, set max_iters > 1.” and “UserWarning: tune-sklearn implements incremental learning for xgboost models following this: incremental learning, partial_fit like sklearn? · Issue #1686 · dmlc/xgboost · GitHub. This may negatively impact performance. To disable, set early_stopping=False.” I’ve been testing on smaller parameter spaces, but I’m afraid it will slow down if I expand to the full parameter space. I also don’t understand the max_iters warning, as I assumed that shouldn’t apply to TuneGridSearchCV.

Essentially, the reporting is what slows my model down, and the run fails when the directory holding ray_results runs out of storage. So I have 3 questions (please understand I am quite new to all this; I tried looking in the docs but I don’t exactly understand what to do):

  1. How can I report less frequently?
  2. How can I change the ray_results folder from ~/ray_results to something else?
  3. Should I implement early stopping despite the warnings?

I would really appreciate answers to any/all 3! Thank you!

import numpy as np
import pandas as pd
from pandas import MultiIndex, Int16Dtype
from sklearnex import patch_sklearn  # these lines MUST come before importing any sklearn packages
patch_sklearn()

import xgboost as xgb

from tune_sklearn import TuneGridSearchCV

from datetime import datetime
import sys

if __name__ == "__main__":

        df_train = pd.read_excel('my_dataset.xlsx', sheet_name = 'Train')

        train_cols = df_train.columns[df_train.columns != 'Response']

        X_train = pd.DataFrame(df_train, columns=train_cols)
        y_train = pd.DataFrame(df_train, columns=['Response'])

        params =  {
                "n_estimators"  : list(range(100, 800, 100)),
                "max_depth"        : list(range(2, 12, 2)), #range: [0,∞] from document
                "min_child_weight" : list(range(2, 12, 2)) #range: [0,∞] from document
                #"gamma"            : np.arange(0, 1.05, 0.1), #range: [0,∞] from document
                #"colsample_bytree" : np.arange(0.5, 1.05, 0.1), #range: [0,1] from document
                #"colsample_bylevel" : np.arange(0.5, 1.05, 0.1), #range: [0,1] from document
                #'reg_lambda': [0.1, 1.0, 5.0, 10.0, 25.0, 50.0]
        }

        xgb_model = xgb.XGBClassifier(seed=0, use_label_encoder = False, tree_method = 'hist')
        print("Tree method = hist, 74 cores, a100 gpu, 3 gpus per, hist, early stopping")

        grid_cv = TuneGridSearchCV(xgb_model, param_grid = params, early_stopping = True, cv = 5, n_jobs = -1, scoring='roc_auc', use_gpu=True)

        current_time = datetime.now().strftime("%H:%M:%S")
        print("Start Time =", current_time)
        print('\n\n')
        grid_cv.fit(X_train, y_train.values.ravel())

        current_time = datetime.now().strftime("%H:%M:%S")
        print('End Time: ', current_time)

        print('Grid best score (roc_auc): ', grid_cv.best_score_)
        print('Grid best hyperparameters: ', grid_cv.best_params_)

Hi @shoopdoop
Thanks for the great question. Here is the API for TuneGridSearchCV.
To change the ray_results folder, you can use the local_dir argument.
You can also change max_iters there. To use early_stopping properly, you will need to set max_iters > 1. You may also find this notebook helpful.
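
As a rough sketch of the two arguments mentioned above (local_dir and max_iters), something like the following might work. The helper names tune_search_kwargs and make_search, the example work directory, and the max_iters value of 10 are all made up for illustration, so adjust them to your cluster and check the TuneGridSearchCV API for your installed version.

```python
from pathlib import Path


def tune_search_kwargs(work_dir, max_iters=10):
    """Build keyword arguments for TuneGridSearchCV (hypothetical helper).

    local_dir moves the results folder off the slow home directory;
    max_iters > 1 lets early_stopping actually do partial training.
    """
    results_dir = Path(work_dir).expanduser() / "ray_results"
    return {
        "local_dir": str(results_dir),
        "early_stopping": True,
        "max_iters": max_iters,  # assumed value, tune for your problem
    }


def make_search(estimator, param_grid, work_dir):
    """Create the search object (hypothetical helper)."""
    # Imported here so the kwargs helper can be used without tune-sklearn installed.
    from tune_sklearn import TuneGridSearchCV

    return TuneGridSearchCV(
        estimator,
        param_grid=param_grid,
        cv=5,
        scoring="roc_auc",
        **tune_search_kwargs(work_dir),
    )
```

In your script you would then replace the TuneGridSearchCV(...) line with something like grid_cv = make_search(xgb_model, params, "/path/to/your/work/dir"), keeping n_jobs and use_gpu as you had them if you still need them.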