Offline inference with vLLM: map_batches vs build_llm_processor

By default, Ray Data’s batch processing pipeline is not guaranteed to be deterministic—batch and sample ordering may vary between runs due to parallel execution and scheduling. However, you can enable deterministic ordering by setting preserve_order=True in the Ray Data execution context. This ensures that the order of batches and samples remains consistent across repeated runs (Ray Data reproducibility guide, Ray Discourse).


