[Tune] Control TuneSearchCV reporting

2021-05-07 16:40:44,930 WARNING util.py:161 -- Processing trial results took 27.238 s, which may be a performance bottleneck. Please consider reporting results less frequently to Ray Tune.
2021-05-07 16:40:44,930 WARNING util.py:161 -- The `process_trial` operation took 27.239 s, which may be a performance bottleneck.
2021-05-07 16:40:45,684 WARNING ray_trial_executor.py:666 -- Over the last 60 seconds, the Tune event loop has been backlogged processing new results. Consider increasing your period of result reporting to improve performance.

How do I control reporting and checkpointing? I know we can do this via tune.run, but is there an option for TuneSearchCV?
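
(For reference, what I mean by doing this via tune.run: there the reporting period is simply how often the trainable calls tune.report(), and checkpoints can be made less frequent with arguments like checkpoint_freq. A rough sketch of that, with a placeholder training step:)

from ray import tune

def trainable(config):
    for step in range(100):
        loss = run_one_step(config)  # placeholder for the actual training step
        if step % 10 == 0:           # report only every 10th step to reduce Tune overhead
            tune.report(loss=loss)

tune.run(trainable, config={"lr": tune.loguniform(1e-4, 1e-1)}, num_samples=4)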

Thanks in advance!

Hmm, I think there is no option to do this for TuneSearchCV. What does your script look like?

from ray import tune
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold
from tune_sklearn import TuneSearchCV

def data_parsing():
    ...
    return X_train, X_test, Y_train, Y_test

def tuning(X_train, X_test, Y_train, Y_test):
    model = GradientBoostingClassifier()
    config = {
        "n_estimators": tune.randint(1, 200),
        "min_samples_split": tune.randint(2, 100),
        "min_samples_leaf": tune.randint(1, 100),
        "max_features": tune.randint(1, 10),
        "subsample": tune.uniform(0.1, 1.0),
        "learning_rate": tune.loguniform(0.01, 1.0),
        "max_depth": tune.randint(2, 200),
    }
    # max_iters specifies how many times tune-sklearn will be given the
    # decision to start/stop training a model. Thus, if you have
    # early_stopping=False, you should set max_iters=1 (let sklearn fit
    # the entire estimator).
    clf = TuneSearchCV(model,
                       param_distributions=config,
                       n_trials=300,
                       early_stopping=True,
                       max_iters=1,
                       search_optimization="bayesian",
                       n_jobs=30,
                       refit=True,
                       cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
                       verbose=0,
                       #loggers="tensorboard",
                       random_state=42,
                       local_dir="./ray_results")
    clf.fit(X_train, Y_train)
    print(f'{model}_{var}:{clf.best_params_}', file=open("tuning/tuned_parameters.csv", "a"))
    clf = clf.best_estimator_
    # calc metrics
    ...
    return None

if __name__ == "__main__":
    X_train, X_test, Y_train, Y_test = data_parsing()
    tuning(X_train, X_test, Y_train, Y_test)

@rliaw any suggestions?
Thanks in advance!

I’m getting this warning with tune.run as well, and I’m not sure what would cause it to take such a long time. It would be helpful just to have more insight into when process_trial_results runs and what could slow it down.

Would it be possible to run a Python profiler?
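
For example, a minimal sketch with the standard library's cProfile, run in place of the plain tuning(...) call in your __main__ block (names taken from your script above):

import cProfile
import pstats

# Profile the driver process, which is where Tune processes trial results.
cProfile.run("tuning(X_train, X_test, Y_train, Y_test)", "tune_profile.prof")
pstats.Stats("tune_profile.prof").sort_stats("cumulative").print_stats(30)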

Within process_trial_results, there are actually profiling messages for a number of events (e.g., the following):

            with warn_if_slow("scheduler.on_trial_result"):
                decision = self._scheduler_alg.on_trial_result(
                    self, trial, flat_result)
            if decision == TrialScheduler.STOP:
                result.update(done=True)
            with warn_if_slow("search_alg.on_trial_result"):
                self._search_alg.on_trial_result(trial.trial_id, flat_result)
            with warn_if_slow("callbacks.on_trial_result"):
                self._callbacks.on_trial_result(
                    iteration=self._iteration,
                    trials=self._trials,
                    trial=trial,
                    result=result.copy())
       ...

It might be something we don’t log (like a checkpointing call or a callback).
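
If there is a custom Callback in the mix, one rough way to rule it out would be to time it directly, e.g. (just a sketch against the public Callback interface):

import time
from ray.tune import Callback

class TimedCallback(Callback):
    def on_trial_result(self, iteration, trials, trial, result, **info):
        start = time.time()
        # ... the callback's normal result handling would go here ...
        elapsed = time.time() - start
        if elapsed > 1.0:
            print(f"on_trial_result for {trial} took {elapsed:.2f} s")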

I won’t have a chance to profile this for a little while, but maybe it’s just the Bayesian optimization process taking a while to fit the KDEs (I’m using BOHB)?