Offline inference with vLLM: map_batches vs build_llm_processor

By default, Ray Data’s batch processing pipeline is not guaranteed to be deterministic—batch and sample ordering may vary between runs due to parallel execution and scheduling. However, you can enable deterministic ordering by setting preserve_order=True in the Ray Data execution context. This ensures that the order of batches and samples remains consistent across repeated runs (Ray Data reproducibility guide, Ray Discourse).


