Disclaimer: I'm new to Ray and Ray Tune, so bear with me if I miss some essentials.
I've looked at the examples for Ray Tune directly and at the xgboost_ray extension as well.
Both tuning examples use the "default" xgboost API rather than the scikit-learn API. Am I correct in assuming that the scikit-learn API does not work with tuning?
So far I've tried this:
def tune(self, num_samples=10, max_t=10):
    scheduler = ASHAScheduler(
        max_t=max_t,  # 10 training iterations
        grace_period=1,
        reduction_factor=2)
    self.analysis = tune.run(
        self._train,  # pass the trainable itself, not the result of calling it
        metric="mean_metric",
        mode=self.metric_mode,
        # set to 1 actor and 6 CPUs per actor
        resources_per_trial=self.ray_params.get_tune_resources(),
        config=self.config,
        num_samples=num_samples,
        scheduler=scheduler)
    return self.analysis

def _train(self, config):
    # Tune calls the trainable once per trial with the sampled config dict.
    clf = RayXGBClassifier(
        objective="binary:logistic",
        use_label_encoder=False,
        # In XGBoost-Ray, n_jobs sets the number of actors
        # 1 = single machine
        n_jobs=1,
        **config
    )
    scores = cross_val_score(clf, self.train_x, self.train_y,
                             scoring=self.scorer, cv=5)
    tune.report(mean_metric=scores.mean(), done=True)
This is part of a class that handles all the parameter inputs. If I run this by calling the tune() method, it starts and then never stops; CPU usage sits at 0%, so nothing seems to be happening at all, and there is no error message.
How can I "enable" cross-validation per parameter config? I don't want to overfit on a single split.
EDIT:
I have now found an example for tune-sklearn, which I can't link here as I'm only allowed to put two links into my post.
I adjusted the example to my needs, but I get the exact same effect: the method never completes and CPU usage stays at 0.
def tune(self, n_estimators=100, iterations=10):
    xgb = XGBClassifier(
        n_estimators=n_estimators,
        objective="binary:logistic",
        nthread=6
        # tree_method="gpu_hist"  # this enables GPU.
        # See https://github.com/dmlc/xgboost/issues/2819
    )
    self.tune_search = TuneSearchCV(
        xgb,
        param_distributions=self.params,
        n_trials=iterations,
        early_stopping=True,  # uses Async HyperBand if set to True
        max_iters=10,
        search_optimization="optuna",
        cv=5,
        scoring=self.scorer,
        mode=self.metric_mode
    )
    print("Start Search")
    self.tune_search.fit(self.train_x, self.train_y)
    print("End Search")
EDIT 2:
For the second example just above, it actually takes several minutes until I finally get a very long stack trace with this error at the end:

RuntimeError: Unable to connect to Redis at 127.0.0.1:6379 after 16 retries. Check that 127.0.0.1:6379 is reachable from this machine. If it is not, your firewall may be blocking this port. If the problem is a flaky connection, try setting the environment variable RAY_START_REDIS_WAIT_RETRIES to increase the number of attempts to ping the Redis server.

What does this mean? Do I need to have Redis installed and started?
UPDATE:
It turns out I have another service listening on port 6379. My question now is: how can I set the Redis port in code in the example above?
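What I'm currently trying, on the assumption that tune-sklearn attaches to an already-running Ray session instead of starting its own, is initializing Ray myself before calling fit(). Older Ray releases exposed the port via a redis_port argument to ray.init(); I'm not sure that is still the correct argument name in current versions.

```python
import ray

# Assumption: if a Ray session already exists, tune-sklearn should reuse
# it, so initializing Ray first would let me pick the port myself.
# redis_port is the argument name from older Ray releases; I'm not
# certain it's still current.
if not ray.is_initialized():
    ray.init(redis_port=6380)  # any free port my other service isn't using
```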