ValueError with TuneGridSearchCV

Hey everyone, I’m really new to machine learning and Python, so I don’t know much.

I set up a Pipeline with sklearn, and when I run the grid search with sklearn’s GridSearchCV it works fine. But when I use TuneGridSearchCV without changing anything else, it always stops searching at the first error that comes up (in this case a ValueError from PCA’s n_components parameter). If I’m not wrong, it is normal for these errors to come up, but GridSearchCV simply skips the offending parameter combination and moves on to the next one, while TuneGridSearchCV stops the whole script and raises the first error it hits.
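
For reference, here is a minimal sketch of the kind of setup I mean; the step names, the RandomForestClassifier, the toy data and the parameter values are just placeholders, not my exact code:

    from sklearn.datasets import make_classification
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV
    from tune_sklearn import TuneGridSearchCV

    # Toy data with only 9 features, so n_components=10 is invalid for PCA.
    x_train, y_train = make_classification(n_samples=200, n_features=9, random_state=0)

    pipe = Pipeline([
        ("scaler", StandardScaler()),
        ("pca", PCA()),
        ("clf", RandomForestClassifier()),
    ])

    param_grid = {
        "pca__n_components": [2, 5, 10],   # 10 > 9 triggers the ValueError
        "clf__n_estimators": [100, 200],
    }

    # GridSearchCV: the invalid combination just gets a NaN score
    # (with a FitFailedWarning) and the search keeps going.
    GridSearchCV(pipe, param_grid, cv=5).fit(x_train, y_train)

    # TuneGridSearchCV: the same grid aborts at the first ValueError.
    TuneGridSearchCV(pipe, param_grid, cv=5).fit(x_train, y_train)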

The only idea I’ve had was setting the error_score parameter to zero, but it still stops at the first error. This is the whole warning/error output I get:

/opt/conda/lib/python3.7/site-packages/ray/tune/tune.py:369: UserWarning: The `loggers` argument is deprecated. Please pass the respective `LoggerCallback` classes to the `callbacks` argument instead. See https://docs.ray.io/en/latest/tune/api_docs/logging.html
  "The `loggers` argument is deprecated. Please pass the respective "
(pid=174) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.61s to run.
(pid=174) If this happens often in your code, it can cause performance problems 
(pid=174) (results will be correct in all cases). 
(pid=174) The reason for this is probably some large input arguments for a wrapped
(pid=174)  function (e.g. large strings).
(pid=174) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=174)  example so that they can fix the problem.
(pid=174)   **fit_params_steps[name],
(pid=173) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.60s to run.
(pid=173) If this happens often in your code, it can cause performance problems 
(pid=173) (results will be correct in all cases). 
(pid=173) The reason for this is probably some large input arguments for a wrapped
(pid=173)  function (e.g. large strings).
(pid=173) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=173)  example so that they can fix the problem.
(pid=173)   **fit_params_steps[name],
(pid=175) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.62s to run.
(pid=175) If this happens often in your code, it can cause performance problems 
(pid=175) (results will be correct in all cases). 
(pid=175) The reason for this is probably some large input arguments for a wrapped
(pid=175)  function (e.g. large strings).
(pid=175) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=175)  example so that they can fix the problem.
(pid=175)   **fit_params_steps[name],
(pid=172) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.64s to run.
(pid=172) If this happens often in your code, it can cause performance problems 
(pid=172) (results will be correct in all cases). 
(pid=172) The reason for this is probably some large input arguments for a wrapped
(pid=172)  function (e.g. large strings).
(pid=172) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=172)  example so that they can fix the problem.
(pid=172)   **fit_params_steps[name],
(pid=174) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.60s to run.
(pid=174) If this happens often in your code, it can cause performance problems 
(pid=174) (results will be correct in all cases). 
(pid=174) The reason for this is probably some large input arguments for a wrapped
(pid=174)  function (e.g. large strings).
(pid=174) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=174)  example so that they can fix the problem.
(pid=174)   **fit_params_steps[name],
(pid=173) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.60s to run.
(pid=173) If this happens often in your code, it can cause performance problems 
(pid=173) (results will be correct in all cases). 
(pid=173) The reason for this is probably some large input arguments for a wrapped
(pid=173)  function (e.g. large strings).
(pid=173) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=173)  example so that they can fix the problem.
(pid=173)   **fit_params_steps[name],
(pid=175) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.61s to run.
(pid=175) If this happens often in your code, it can cause performance problems 
(pid=175) (results will be correct in all cases). 
(pid=175) The reason for this is probably some large input arguments for a wrapped
(pid=175)  function (e.g. large strings).
(pid=175) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=175)  example so that they can fix the problem.
(pid=175)   **fit_params_steps[name],
(pid=172) /opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py:355: UserWarning: Persisting input arguments took 0.62s to run.
(pid=172) If this happens often in your code, it can cause performance problems 
(pid=172) (results will be correct in all cases). 
(pid=172) The reason for this is probably some large input arguments for a wrapped
(pid=172)  function (e.g. large strings).
(pid=172) THIS IS A JOBLIB ISSUE. If you can, kindly provide the joblib's team with an
(pid=172)  example so that they can fix the problem.
(pid=172)   **fit_params_steps[name],
---------------------------------------------------------------------------
RayTaskError(ValueError)                  Traceback (most recent call last)
/tmp/ipykernel_36/325189685.py in <module>
     20 
     21 start = time.time()
---> 22 tune_search.fit(x_train, y_train)
     23 end = time.time()
     24 print(f'TuneGridSearchCV total fitting time: {round(end - start, 3)} sec')

/opt/conda/lib/python3.7/site-packages/tune_sklearn/tune_basesearch.py in fit(self, X, y, groups, tune_params, **fit_params)
    663             ray_kwargs["local_mode"] = True
    664         with ray_context(**ray_kwargs):
--> 665             return self._fit(X, y, groups, tune_params, **fit_params)
    666 
    667     def score(self, X, y=None):

/opt/conda/lib/python3.7/site-packages/tune_sklearn/tune_basesearch.py in _fit(self, X, y, groups, tune_params, **fit_params)
    568 
    569         self._fill_config_hyperparam(config)
--> 570         analysis = self._tune_run(config, resources_per_trial, tune_params)
    571 
    572         self.cv_results_ = self._format_results(self.n_splits, analysis)

/opt/conda/lib/python3.7/site-packages/tune_sklearn/tune_gridsearch.py in _tune_run(self, config, resources_per_trial, tune_params)
    288                 "ignore", message="fail_fast='raise' "
    289                 "detected.")
--> 290             analysis = tune.run(trainable, **run_args)
    291         return analysis

/opt/conda/lib/python3.7/site-packages/ray/tune/tune.py in run(run_or_experiment, name, metric, mode, stop, time_budget_s, config, resources_per_trial, num_samples, local_dir, search_alg, scheduler, keep_checkpoints_num, checkpoint_score_attr, checkpoint_freq, checkpoint_at_end, verbose, progress_reporter, log_to_file, trial_name_creator, trial_dirname_creator, sync_config, export_formats, max_failures, fail_fast, restore, server_port, resume, queue_trials, reuse_actors, trial_executor, raise_on_failed_trial, callbacks, max_concurrent_trials, loggers, _remote)
    599     progress_reporter.set_start_time(tune_start)
    600     while not runner.is_finished() and not state[signal.SIGINT]:
--> 601         runner.step()
    602         if has_verbosity(Verbosity.V1_EXPERIMENT):
    603             _report_progress(runner, progress_reporter)

/opt/conda/lib/python3.7/site-packages/ray/tune/trial_runner.py in step(self)
    703                 if self.trial_executor.in_staging_grace_period():
    704                     timeout = 0.1
--> 705                 self._process_events(timeout=timeout)
    706             else:
    707                 self._run_and_catch(self.trial_executor.on_no_available_trials)

/opt/conda/lib/python3.7/site-packages/ray/tune/trial_runner.py in _process_events(self, timeout)
    861             else:
    862                 with warn_if_slow("process_trial"):
--> 863                     self._process_trial(trial)
    864 
    865             # `self._queued_trial_decisions` now contains a final decision

/opt/conda/lib/python3.7/site-packages/ray/tune/trial_runner.py in _process_trial(self, trial)
    888         """
    889         try:
--> 890             results = self.trial_executor.fetch_result(trial)
    891             with warn_if_slow(
    892                     "process_trial_results",

/opt/conda/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py in fetch_result(self, trial)
    786         self._running.pop(trial_future[0])
    787         with warn_if_slow("fetch_result"):
--> 788             result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
    789 
    790         # For local mode

/opt/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py in wrapper(*args, **kwargs)
    103             if func.__name__ != "init" or is_client_mode_enabled_by_default:
    104                 return getattr(ray, func.__name__)(*args, **kwargs)
--> 105         return func(*args, **kwargs)
    106 
    107     return wrapper

/opt/conda/lib/python3.7/site-packages/ray/worker.py in get(object_refs, timeout)
   1623                     worker.core_worker.dump_object_store_memory_usage()
   1624                 if isinstance(value, RayTaskError):
-> 1625                     raise value.as_instanceof_cause()
   1626                 else:
   1627                     raise value

RayTaskError(ValueError): ray::_Trainable.train_buffered() (pid=173, ip=172.19.2.2, repr=<tune_sklearn._trainable._Trainable object at 0x7f13ec0fc250>)
  File "/opt/conda/lib/python3.7/site-packages/ray/tune/trainable.py", line 224, in train_buffered
    result = self.train()
  File "/opt/conda/lib/python3.7/site-packages/ray/tune/trainable.py", line 283, in train
    result = self.step()
  File "/opt/conda/lib/python3.7/site-packages/tune_sklearn/_trainable.py", line 106, in step
    return self._train()
  File "/opt/conda/lib/python3.7/site-packages/tune_sklearn/_trainable.py", line 247, in _train
    error_score="raise")
  File "/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 283, in cross_validate
    for train, test in cv.split(X, y, groups)
  File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
    self._dispatch(tasks)
  File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 777, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/opt/conda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/opt/conda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/opt/conda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/opt/conda/lib/python3.7/site-packages/sklearn/utils/fixes.py", line 211, in __call__
    return self.function(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 681, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py", line 390, in fit
    Xt = self._fit(X, y, **fit_params_steps)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py", line 355, in _fit
    **fit_params_steps[name],
  File "/opt/conda/lib/python3.7/site-packages/joblib/memory.py", line 591, in __call__
    return self._cached_call(args, kwargs)[0]
  File "/opt/conda/lib/python3.7/site-packages/joblib/memory.py", line 534, in _cached_call
    out, metadata = self.call(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/joblib/memory.py", line 761, in call
    output = self.func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/pipeline.py", line 893, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/decomposition/_pca.py", line 407, in fit_transform
    U, S, Vt = self._fit(X)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/decomposition/_pca.py", line 457, in _fit
    return self._fit_full(X, n_components)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/decomposition/_pca.py", line 478, in _fit_full
    "svd_solver='full'" % (n_components, min(n_samples, n_features))
ValueError: n_components=10 must be between 0 and min(n_samples, n_features)=9 with svd_solver='full'

Is there anything I should change when going from GridSearchCV to TuneGridSearchCV?

Hmm, this is an explicit decision we made in TuneGridSearchCV: internally the trainable runs cross-validation with error_score="raise" (you can see it in tune_sklearn/_trainable.py in your traceback), so the first failing parameter combination aborts the whole run.

Hypothetically, we could make this configurable via an environment variable. If you’re interested, could you open an issue or a pull request?
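
In the meantime, one possible workaround (just a sketch reusing the placeholder pipe/param_grid from the earlier snippet, not an official tune-sklearn API) is to drop the parameter values you already know are invalid before handing the grid to TuneGridSearchCV:

    # Cap n_components at min(n_samples, n_features) of the training data.
    # Note: CV folds contain fewer rows than x_train, so you may need an even
    # tighter cap if n_samples is the limiting dimension.
    max_components = min(x_train.shape)

    param_grid = {
        "pca__n_components": [n for n in [2, 5, 10] if n <= max_components],
        "clf__n_estimators": [100, 200],
    }

    tune_search = TuneGridSearchCV(pipe, param_grid, cv=5)
    tune_search.fit(x_train, y_train)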

Same here: