Map Batches not using all CPUs

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I have a TB-sized tabular data set that is going through these preprocessing functions:

    import numpy as np
    from ray.data.preprocessors import Chain, Concatenator

    preprocessor = Chain(Concatenator(include=cat_cols, dtype=np.int32, output_column_name=['x_cat']),
                         Concatenator(exclude=["y", "x_cat"], dtype=np.float32, output_column_name=['x_cont']))

My cluster is a single node with 40 CPUs, 672 GB of RAM and 8x GPUs.

I cannot get Ray to use all the CPUs when processing the data. Fiddling with the object store memory, I observe the following use of CPUs:

  • 10 GB object store memory: 1 / 40 CPUs
  • default object store memory: 9 / 40 CPUs
  • 610 GB object store memory (equal to shm): 27 / 40 CPUs

How can I get Ray to use all CPUs while preprocessing data?

By default, each map task requests 1 logical CPU. If your CPUs are not saturated, it may indicate that each task isn't really keeping a full CPU busy. Lowering the CPU request per task will help, because Ray can then run more tasks concurrently.
To override this setting, you can create your own Preprocessor subclass and override the _get_transform_config method so that it returns something like {"num_cpus": 0.5}.
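Here is a rough sketch of that override, not a tested recipe. FractionalCpuMixin, LightConcatenator and build_preprocessor are made-up names for illustration, and the Concatenator arguments are copied from the question as-is. It assumes your Ray version passes the dict returned by _get_transform_config through to map_batches, and that Chain runs each wrapped preprocessor's own transform, so please check both against the version you have installed.

```python
class FractionalCpuMixin:
    """Hypothetical mixin: lowers the logical CPU request of each map task."""

    # Assumption: 0.5 is only a starting point; tune it for your workload.
    num_cpus_per_task = 0.5

    def _get_transform_config(self):
        # Returned kwargs are assumed to be forwarded to Dataset.map_batches.
        return {"num_cpus": self.num_cpus_per_task}


def build_preprocessor(cat_cols):
    """Rebuild the chain from the question with the lighter CPU request."""
    # Imported here so the mixin above can be read without Ray installed.
    import numpy as np
    from ray.data.preprocessors import Chain, Concatenator

    # The mixin comes first so its _get_transform_config takes precedence.
    class LightConcatenator(FractionalCpuMixin, Concatenator):
        pass

    return Chain(
        LightConcatenator(include=cat_cols, dtype=np.int32,
                          output_column_name=['x_cat']),
        LightConcatenator(exclude=["y", "x_cat"], dtype=np.float32,
                          output_column_name=['x_cont']),
    )
```

With 40 logical CPUs, a request of 0.5 raises the scheduling ceiling from 40 to 80 concurrent map tasks. Whether that many actually run still depends on how many blocks the dataset has and on object store memory, which the numbers in the question suggest is a limiting factor here.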
