Thanks for the prompt reply.
Modifying the code snippet from here, the following is what I want to achieve. Essentially, I want to use the trained LightGBM model to get the test predictions, then compute a custom error myself and report that metric to Tune. How can I achieve this in the current scenario, given that `train`
automatically reports its metrics to Tune?
def train_breast_cancer(config, ray_params):
    """Tune trainable: fit a distributed LightGBM model, compute a custom
    post-processed error on the held-out split, and report it to Tune.

    Args:
        config: LightGBM parameter dict sampled by Tune for this trial.
        ray_params: ``RayParams`` controlling the distributed training actors.
    """
    # Load dataset.
    data, labels = datasets.load_breast_cancer(return_X_y=True)
    # Split into train and test set.
    train_x, test_x, train_y, test_y = train_test_split(
        data, labels, test_size=0.25)
    train_set = RayDMatrix(train_x, train_y)
    test_set = RayDMatrix(test_x, test_y)

    evals_result = {}
    bst = train(
        params=config,
        dtrain=train_set,
        valid_sets=[train_set, test_set],
        valid_names=["train", "eval"],
        evals_result=evals_result,
        ray_params=ray_params,
        verbose_eval=True,
        num_boost_round=100)

    print('-' * 10, 'Saving model', '-' * 10)
    # BUG FIX: the trial accessors must be *called*; the original printed
    # the bound function objects, not the trial name/id.
    print(tune.get_trial_name(), tune.get_trial_id())
    model_path = "tuned.lgbm"
    bst.booster_.save_model(model_path)
    print("Final validation error: {:.4f}".format(
        evals_result["eval"]["binary_error"][-1]))

    #####################################################################################
    # Predict
    #####################################################################################
    # BUG FIX: the lightgbm_ray prediction helper is lowercase `predict`;
    # the capitalized `Predict` would raise NameError.
    bst = lgb.Booster(model_file=model_path)  # lgb is standard lightgbm module
    pred_ray = predict(bst, test_set, ray_params=RayParams(num_actors=NUM_ACTORS))

    #####################################################################################
    # Calculate custom loss, that does custom operations on `pred_ray`
    #####################################################################################
    # NOTE(review): `some_metadata` is not defined anywhere in this snippet —
    # it must be supplied (e.g. via tune.with_parameters) before this runs.
    custom_error = my_custom_error_that_does_some_postprocessing(
        pred_ray, test_y, some_metadata)
    evals_result['eval-custom_error'] = custom_error

    #####################################################################################
    # Report this custom error to tune, and do HPO based on this
    #####################################################################################
    # tune.run() below selects trials on metric="eval-custom_error".
    tune.report(**evals_result)
def main(cpus_per_actor, num_actors, num_samples):
    """Run a Ray Tune hyperparameter search over the LightGBM trainer,
    save the best model, and print its parameters and accuracy.

    Args:
        cpus_per_actor: CPUs to allocate to each training actor.
        num_actors: Number of distributed training actors per trial.
        num_samples: Number of Tune trials to sample.
    """
    # Set LightGBM config: fixed binary objective/metrics, tunable
    # learning rate, row subsampling and tree depth.
    config = {
        "objective": "binary",
        "metric": ["binary_logloss", "binary_error"],
        "eta": tune.loguniform(1e-4, 1e-1),
        "subsample": tune.uniform(0.5, 1.0),
        "max_depth": tune.randint(1, 9),
    }
    ray_params = RayParams(
        max_actor_restarts=1,
        gpus_per_actor=0,
        cpus_per_actor=cpus_per_actor,
        num_actors=num_actors)

    print('-' * 10, 'Running Ray tune', '-' * 10)
    # Minimize the custom metric reported by each trial via tune.report().
    analysis = tune.run(
        tune.with_parameters(train_breast_cancer, ray_params=ray_params),
        # Use the `get_tune_resources` helper function to set the resources.
        resources_per_trial=ray_params.get_tune_resources(),
        config=config,
        num_samples=num_samples,
        metric="eval-custom_error",
        mode="min",
        local_dir="./tune_results")

    # Load the best model checkpoint from the winning trial's log dir.
    best_bst = lightgbm_ray.tune.load_model(
        os.path.join(analysis.best_logdir, "tuned.lgbm"))
    best_bst.save_model("best_model.lgbm")

    # BUG FIX: the trial reports tune.report(**evals_result) whose keys are
    # "train"/"eval" (nested dicts) and "eval-custom_error" — there is no
    # flat "eval-binary_error" key, so the original lookup would KeyError.
    # Read the last binary_error entry from the nested "eval" results.
    accuracy = 1. - analysis.best_result["eval"]["binary_error"][-1]
    print(f"Best model parameters: {analysis.best_config}")
    print(f"Best model total accuracy: {accuracy:.4f}")