When I increase the number of workers to n, the execution time usually drops to roughly 1/n of the single-worker time. In my experiment, however, once the number of workers reached 12, adding more workers no longer reduced the execution time significantly. The machine I ran on has 80 CPU cores, and the experimental code assigns one CPU core to each worker. Even when I increased the amount of test data, scaling still plateaued at 12 workers. What causes this phenomenon? Are some worker processes unable to run in parallel?
The code is as follows:
import time
import ray

ray.init(num_cpus=40)
start_time = time.time()
# Build a 12-block dataset and map the tile-reading function over it.
transformed_ds = ray.data.from_items(rgb_list, parallelism=12)
transformed_ds = transformed_ds.map(
    lambda index: get_tile_parallel_rgb_map(
        index, path_rgb, tile_offsets_rgb, tile_byte_counts_rgb),
    concurrency=12)
# take_all() triggers execution and collects the results.
result = transformed_ds.take_all()
end_time = time.time()
print("total time: " + str(end_time - start_time))
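A minimal sketch of the kind of scaling sweep used to observe the plateau could look like the following (it reuses rgb_list, get_tile_parallel_rgb_map, path_rgb, tile_offsets_rgb, and tile_byte_counts_rgb from the snippet above; the particular worker counts are arbitrary examples):

import time
import ray

ray.init(num_cpus=40)
for workers in (4, 8, 12, 16, 24, 32, 40):
    # Rebuild the pipeline with `workers` blocks and `workers` concurrent map tasks.
    ds = ray.data.from_items(rgb_list, parallelism=workers)
    ds = ds.map(
        lambda index: get_tile_parallel_rgb_map(
            index, path_rgb, tile_offsets_rgb, tile_byte_counts_rgb),
        concurrency=workers)
    start = time.time()
    ds.take_all()  # execution is lazy; take_all() forces it
    print("workers=" + str(workers) + ": " + str(time.time() - start) + " s")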