Hi,
I'm trying to use the CustomKBinsDiscretizer provided by Ray to bin a feature in my data, but I get unexpected results (e.g. 0.25488 should fall into bin 7, but I get bin 6).
Below is my sample code:
import pandas as pd
import ray
from ray.data.preprocessors import CustomKBinsDiscretizer
df = pd.DataFrame(
    pd.Series([0.25488, -1.14293, -1.45107, -0.87993, 0.42676, 0.96310, -0.66250, -0.45334, -0.60658, -0.58381,
               -0.10751, -0.48234, 0.74152, -0.95448, -0.35601, -0.91099, 0.86991, 1.88669, 1.77901]),
    columns=['a'])
ds = ray.data.from_pandas(df)
discretizer = CustomKBinsDiscretizer(
    columns=["a"],
    bins={'a': [-1.04767, -0.784675, -0.612797, -0.46991, -0.26904, -0.053674, 0.2321, 0.840923, 1.53672,
                4.094189]},
)
discretizer_data = discretizer.transform(ds).to_pandas()
print(discretizer_data)
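
For cross-checking, here is a minimal sketch that bins two of the values with plain pandas and NumPy, using the same edges. It assumes (not confirmed here) that the discretizer labels bins the way pd.cut does with labels=False, i.e. with zero-based interval codes. The names edges, values, cut_codes and digitize_codes are only illustrative. Under that assumption the difference between 6 and 7 may simply be a numbering convention: pd.cut counts intervals starting at 0, while np.digitize counts how many edges lie at or below the value.

```python
import numpy as np
import pandas as pd

# Same bin edges as in the sample code above.
edges = [-1.04767, -0.784675, -0.612797, -0.46991, -0.26904, -0.053674,
         0.2321, 0.840923, 1.53672, 4.094189]

# One value inside the edges and one below the first edge.
values = pd.Series([0.25488, -1.14293])

# pd.cut with labels=False returns zero-based interval codes;
# values outside the outermost edges become NaN.
cut_codes = pd.cut(values, bins=edges, labels=False)

# np.digitize returns the number of edges <= the value,
# so a value below the first edge gets 0.
digitize_codes = np.digitize(values, edges)

print(cut_codes.tolist())
print(digitize_codes.tolist())
```

If the discretizer output matches cut_codes, then 0.25488 landing in 6 corresponds to the seventh interval (0.2321, 0.840923], and values below -1.04767 would show up as missing rather than as bin 0.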