XGBoostTrainer -- Distributed Weights Not Working?

When trying to use the weights for XGBoostTrainer, like:

from ray.train import ScalingConfig
from ray.train.xgboost import XGBoostTrainer

train_weights_ds = train_set.select_columns(['weight'])

trainer = XGBoostTrainer(
    scaling_config=ScalingConfig(
        num_workers=16,
        use_gpu=True,
    ),
    early_stopping_rounds=10,
    dmatrix_params={"train": {'weight': train_weights_ds}, },
    ...
)

I am met with data size mismatches, suggesting that the weights are not being sharded in line with the data shards sent to each worker. Is it possible to attach the matching slice of weights to each worker?

Check failed: weights_.Size() == num_row_ (92711999 vs. 3862999) : Size of weights must equal to number of rows.

Can you use the weight column name instead?

-    dmatrix_params={"train": {'weight': train_weights_ds}, },
+    dmatrix_params={"train": {'weight': 'weight'}, },
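
For reference, the full call would look roughly like this (a sketch, assuming 'weight' is kept as a column of the train_set dataset passed to the trainer via datasets, rather than selected out into its own dataset):

trainer = XGBoostTrainer(
    scaling_config=ScalingConfig(
        num_workers=16,
        use_gpu=True,
    ),
    datasets={"train": train_set},  # 'weight' stays a column of train_set
    early_stopping_rounds=10,
    # The column name is resolved per shard, so each worker's DMatrix
    # gets weights that match its own row count.
    dmatrix_params={"train": {"weight": "weight"}},
    ...
)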

Works great, thanks!

Hi Matt, I tested this and got a message saying that dmatrix_params is deprecated, suggesting I use dataset_config instead to customize Ray Dataset ingestion. Can you advise how to use dataset_config to assign the weight column?

Hey @Will1, thanks for pointing this out. It looks like this was accidentally removed.

I’ve put up a PR that will add this back.

A workaround for current Ray versions would be to use the V2 API, which lets you customize your XGBoost training code more flexibly!

Hi @matthewdeng, thanks for your response and for putting up the PR for a future release. Do you mind giving me an example of how to use the V2 API to correctly set the observation weight column?

Hey @Will1, following up on this: it turns out my PR actually would not solve the problem, but the V2 API would.

The way to do so is to update these lines to pass in the weight parameter where the DMatrix is constructed; at that point, the weights should be a column in train_df or eval_df. A rough sketch is below.
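
Untested sketch of what that per-worker training function could look like. The "target" and "weight" column names, the params, and num_boost_round are placeholders; train_set is the dataset from your original snippet:

import xgboost

import ray.train
from ray.train import ScalingConfig
from ray.train.xgboost import RayTrainReportCallback, XGBoostTrainer


def train_fn_per_worker(config: dict):
    # Each worker receives only its own shard of the "train" dataset.
    train_ds = ray.train.get_dataset_shard("train")
    train_df = train_ds.materialize().to_pandas()

    # Pull the label and weight columns out of this shard's DataFrame.
    # "target" and "weight" are placeholder column names.
    train_y = train_df.pop("target")
    train_w = train_df.pop("weight")

    # Pass the per-shard weights directly to the DMatrix, so the number
    # of weights always matches the number of rows on this worker.
    dtrain = xgboost.DMatrix(train_df, label=train_y, weight=train_w)

    xgboost.train(
        config["params"],
        dtrain=dtrain,
        num_boost_round=config["num_boost_round"],
        # Report metrics and checkpoints back to Ray Train.
        callbacks=[RayTrainReportCallback()],
    )


trainer = XGBoostTrainer(
    train_fn_per_worker,
    train_loop_config={
        # "device": "cuda" assumes XGBoost >= 2.0 for GPU training.
        "params": {"objective": "binary:logistic", "device": "cuda"},
        "num_boost_round": 100,
    },
    scaling_config=ScalingConfig(num_workers=16, use_gpu=True),
    datasets={"train": train_set},  # "weight" stays a column of train_set
)
result = trainer.fit()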

Thanks for the follow-up. This is so helpful!