Map Batches not using all CPUs

How severe does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I have a TB-sized tabular data set that is going through these preprocessing functions:

    import numpy as np
    from ray.data.preprocessors import Chain, Concatenator

    preprocessor = Chain(Concatenator(include=cat_cols, dtype=np.int32, output_column_name="x_cat"),
                         Concatenator(exclude=["y", "x_cat"], dtype=np.float32, output_column_name="x_cont"))
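
For context, the preprocessor is applied roughly like this (the dataset path below is just a placeholder):

    import ray

    # Placeholder path; the real dataset is ~1 TB of tabular data.
    ds = ray.data.read_parquet("s3://my-bucket/tabular-data/")
    ds = preprocessor.fit_transform(ds)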

My cluster is a single node with 40 CPUs, 672 GB of RAM, and 8 GPUs.

I cannot get Ray to use all of the CPUs when preprocessing the data. Fiddling with the object store memory size, I observe the following CPU usage:

  • 10 GB object store memory: 1 / 40 CPUs used
  • default object store memory: 9 / 40 CPUs used
  • 610 GB (equal to shm size): 27 / 40 CPUs used
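
For reference, a minimal sketch of how I was setting the object store memory (the 610 GB value is just one of the settings I tried):

    import ray

    # object_store_memory is specified in bytes.
    ray.init(object_store_memory=610 * 1024**3)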

How can I get Ray to use all CPUs while preprocessing data?

By default, each map task requests 1 logical CPU. If you find that your CPUs are not saturated, it may mean that each task isn't actually using a full CPU. Decreasing the per-task CPU request will help, since more tasks can then run concurrently.
To override this parameter, you can create your own Preprocessor subclass and override its _get_transform_config method so that it returns something like {"num_cpus": 0.5}, as in the sketch below.
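
A minimal sketch of what that subclass could look like, assuming you derive from Concatenator (the class name and the 0.5 value are only illustrative):

    from ray.data.preprocessors import Concatenator

    class FractionalCPUConcatenator(Concatenator):
        """Concatenator whose transform tasks request only half a logical CPU."""

        def _get_transform_config(self):
            # These kwargs are forwarded to the underlying map_batches call,
            # so a smaller num_cpus allows more transform tasks to run at once.
            return {"num_cpus": 0.5}

You would then use FractionalCPUConcatenator in place of Concatenator inside the Chain.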
