Handling large datasets results in error

Hmm, you shouldn’t need to directly call ray.put as tune.with_parameters should implicitly handle this for you.
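For reference, here is a minimal sketch of the intended pattern, assuming your trainable accepts the dataset as a keyword argument (load_large_dataset is a hypothetical placeholder for your own data-loading code):

from ray import tune

def xgboost_hyper_param(config, data=None):
    # `data` is fetched from the Ray object store inside each trial,
    # so it is not captured in the serialized function itself.
    ...

data = load_large_dataset()  # hypothetical helper; substitute your own loading code
tune.run(
    tune.with_parameters(xgboost_hyper_param, data=data),
    config={"max_depth": tune.randint(1, 9)},
)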

I believe the original error you were seeing stems from the same issue: the serialized function is too large. From the provided script, I can’t see anything obviously large, but I might be overlooking something.

As a quick test, could you share the output of this snippet?

from ray import cloudpickle as pickle

# Check how large the serialized trainable actually is.
pickled = pickle.dumps(xgboost_hyper_param)
# Use float division; integer division would round anything
# under 1 MiB down to 0.
length_mib = len(pickled) / (1024 * 1024)
print(f"{length_mib:.2f} MiB")
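If that prints more than a few MiB, something large is likely being captured in the function’s closure or default arguments, and that object is what should be passed through tune.with_parameters instead.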