Handling large datasets results in error

ashwini · October 5, 2021, 6:21am

I have a large dataset, and I’m using tune.with_parameters to pass it to the trainable function. Here is the tuning code.
Trainable function:

def xgboost_hyper_param(config, data=None):
    # feature_names, OBJECTIVE, LEARNING_RATE, N_ESTIMATORS and sharpe_metric
    # are module-level names defined elsewhere in the script.

    #max_depth = int(max_depth)
    #params = {'max_depth' : max_depth, 'learning_rate': learning_rate, 'gamma': gamma}

    # Unpack the dataset supplied through tune.with_parameters.
    trainX, trainY, validX, validY, tr_groups, val_groups = data

    train_dmatrix = xgb.DMatrix(trainX[feature_names], trainY, feature_names=feature_names)
    valid_dmatrix = xgb.DMatrix(validX[feature_names], validY, feature_names=feature_names)
    train_dmatrix.set_group(tr_groups)
    valid_dmatrix.set_group(val_groups)

    # Start from the sampled hyperparameters and add the fixed settings.
    params = config
    params['max_depth'] = int(params['max_depth'])
    params['tree_method'] = 'gpu_hist'
    params['objective'] = OBJECTIVE
    params['learning_rate'] = LEARNING_RATE

    # Train with early stopping and report the validation metric to Tune.
    model = xgb.train(params, train_dmatrix, num_boost_round=N_ESTIMATORS, maximize=True,
                      evals=[(valid_dmatrix, 'eval')], feval=sharpe_metric,
                      verbose_eval=False, early_stopping_rounds=30,
                      callbacks=[TuneReportCallback({"mean_sharpe": "eval-sharpe"})])

Tuning function:

algo = TuneBOHB(metric="mean_sharpe", mode="max", seed=101)
bohb = HyperBandForBOHB(
    time_attr='training_iteration',
    metric='mean_sharpe',
    mode='max',
    max_t=500,
    reduction_factor=3,
)

analysis = tune.run(
    tune.with_parameters(
        xgboost_hyper_param,
        data=(trainX, trainY, validX, validY, tr_groups, val_groups),
    ),
    #metric = "mean_sharpe",
    #mode = "max",
    name=f"run_{round_number}",
    resources_per_trial={"cpu": 4, "gpu": 0.1},
    config=var_space,
    num_samples=500,
    local_dir=f'{root_dir}/logs/pairwise',
    search_alg=algo,
    scheduler=bohb,
    #resume = resume
)

The above setup was working fine until now. The dataset has changed and is much larger, and the same setup now fails with the error "ConnectionError: Error 104 while writing to socket. Connection reset by peer."

I tried using ray.put to place the data in the Ray object store by adding the following lines to the code:


    ray.put(trainX[feature_names])
    ray.put(validX[feature_names])
    ray.put(trainY)
    ray.put(validY)
    ray.put(tr_groups)
    ray.put(val_groups)

Now I’m getting the error “ValueError: The actor ImplicitFunc is too large (795 MiB > FUNCTION_SIZE_ERROR_THRESHOLD=95 MiB)”.

Can someone please help me resolve this problem?

matthewdeng · October 5, 2021, 6:58am

Hmm, you shouldn’t need to directly call ray.put as tune.with_parameters should implicitly handle this for you.

I believe the original error you were seeing is a result of the same issue (the serialized function is too large). From the provided script, I can’t see anything obvious that would be large, but I might be overlooking something.

As a quick test, can you share what the output of this is?

from ray import cloudpickle as pickle

pickled = pickle.dumps(xgboost_hyper_param)
length_mib = len(pickled) // (1024 * 1024)
print(length_mib)
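
If that number comes out large, the bulk usually comes from objects the function references rather than from its own code, since cloudpickle serializes functions defined in the driver script by value together with the globals they use. A rough way to look for the culprit is sketched below. It is not part of Ray: referenced_global_sizes, BIG_TABLE, metric and trainable are hypothetical names, and it uses the standard-library pickle only to estimate sizes. It follows plain Python functions that the trainable calls, but it will miss objects captured through closures, classes or attributes.

```python
import pickle
import types


def referenced_global_sizes(fn, _seen=None):
    """Map global name -> approximate pickled size in bytes.

    Looks at the module-level names used by `fn` and recurses into
    plain Python functions that `fn` references.
    """
    seen = set() if _seen is None else _seen
    sizes = {}
    for name in fn.__code__.co_names:
        # co_names also lists attribute names and builtins; skip those.
        if name not in fn.__globals__ or name in seen:
            continue
        seen.add(name)
        value = fn.__globals__[name]
        if isinstance(value, types.FunctionType):
            sizes.update(referenced_global_sizes(value, seen))
            continue
        try:
            sizes[name] = len(pickle.dumps(value))
        except Exception:
            sizes[name] = None  # e.g. imported modules cannot be pickled this way
    return sizes


# Stand-in for a large module-level DataFrame (hypothetical).
BIG_TABLE = list(range(200_000))


def metric(preds):
    # Reads the global, so the global travels with the function.
    return sum(preds) / len(BIG_TABLE)


def trainable(config):
    return metric([config["x"]])


sizes = referenced_global_sizes(trainable)
for name, size in sizes.items():
    print(name, size)
```

Any name that shows up with a size in the hundreds of MiB is a good candidate to pass in through tune.with_parameters instead of reading it as a global.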

ashwini · October 5, 2021, 11:12am

It is 405 MiB. The custom metric function passed to XGBoost referenced a global variable (a pd.DataFrame), which made the serialized function that large. Thanks for the help.

matthewdeng · October 5, 2021, 11:06pm

Nice! Were you able to move/unlink the global variable and resolve the original issue?

ashwini · October 6, 2021, 4:50am

Yes, I had to use partials for the custom metric.
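
The thread does not show the final code, so the following is only a guess at its general shape. It assumes the metric takes the large object as an explicit argument and that functools.partial binds it inside the trainable, from the data delivered by tune.with_parameters. The metric body, the returns argument and the toy numbers are hypothetical stand-ins; only the (preds, dmatrix) -> (name, value) shape follows what xgboost expects from feval.

```python
from functools import partial
from statistics import mean, pstdev


def sharpe_metric(preds, dmatrix, returns):
    """Toy stand-in for a custom metric with the (preds, dmatrix) shape.

    `returns` is an explicit argument instead of a module-level global.
    """
    pnl = [p * r for p, r in zip(preds, returns)]
    spread = pstdev(pnl)
    return "sharpe", (mean(pnl) / spread if spread else 0.0)


def trainable(config, data=None):
    # `data` is what tune.with_parameters would hand to each trial.
    returns = data
    # Bind the large object here, inside the trial, not at module level.
    feval = partial(sharpe_metric, returns=returns)
    # In the real trainable, feval would be passed on: xgb.train(..., feval=feval)
    return feval


# Small usage check with made-up numbers.
feval = trainable({}, data=[1.0, -1.0, 2.0, 1.0])
name, value = feval([1.0, 1.0, 1.0, 1.0], None)
print(name, round(value, 4))
```

With this layout the metric function itself no longer references the DataFrame, so pickling the trainable does not drag the data along with it.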
