How can I make sure that each map transformation task runs in parallel to get the best throughput?

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I am new to Ray. Inspired by the image in the article "Pattern: Using pipelining to increase throughput" (Ray 2.37.0 docs), I decided to use Ray Data to implement pipeline parallelism for a batch inference task.

I split the inference task into multiple map transform tasks; each task processes the data, either via model inference or custom computation. The tasks are supposed to run in parallel, so that the Nth data row can be processed in task M while the (N-1)th row is processed in task M+1.
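To make the behavior I expect concrete, here is a minimal sketch that uses only Python's standard library (threads and bounded queues), not Ray itself. The stage functions `preprocess` and `infer`, the helper `run_pipeline`, and the `queue_size` parameter are all hypothetical names chosen for illustration; my real stages do model inference and custom computation instead of sleeping.

```python
import queue
import threading
import time

SENTINEL = object()  # marks the end of the stream


def run_stage(name, fn, inbox, outbox, log):
    """Pull rows from inbox one at a time, apply fn, push results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate end-of-stream downstream
            return
        idx, value = item
        log.append((name, idx, "start"))
        result = fn(value)
        log.append((name, idx, "end"))
        outbox.put((idx, result))


def run_pipeline(rows, stages, queue_size=2):
    """Run each stage in its own thread, connected by bounded queues."""
    log = []
    queues = [queue.Queue(maxsize=queue_size) for _ in range(len(stages) + 1)]

    def feed():
        for idx, row in enumerate(rows):
            queues[0].put((idx, row))
        queues[0].put(SENTINEL)

    threads = [threading.Thread(target=feed)]
    for i, (name, fn) in enumerate(stages):
        threads.append(
            threading.Thread(
                target=run_stage,
                args=(name, fn, queues[i], queues[i + 1], log),
            )
        )
    for t in threads:
        t.start()

    # Drain the final queue in the main thread until end-of-stream.
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item[1])
    for t in threads:
        t.join()
    return results, log


def preprocess(x):
    time.sleep(0.02)  # stand-in for real preprocessing work
    return x * 2


def infer(x):
    time.sleep(0.02)  # stand-in for model inference
    return x + 1


results, log = run_pipeline(
    range(5), [("preprocess", preprocess), ("infer", infer)]
)
```

In this sketch, `infer` starts on row 0 while `preprocess` is still working on later rows, which can be checked by comparing the positions of `("infer", 0, "start")` and `("preprocess", 4, "end")` in `log`. This overlap between stages is what I am hoping to get from Ray Data.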

However, I found that Ray Data sometimes produces an execution plan in which the next map transform task does not begin until the previous map transform task has consumed all the data, which leaves the pipeline running at low throughput.

So my question is: is there any configuration or setting to control the execution plan for the batch inference task, so that multiple map transform tasks run in parallel and achieve maximum throughput?