Is Ray suitable for low-latency, high-throughput business workflow orchestration with dynamic configurations?

We have a large-scale risk scoring platform with the following characteristics:

  • The workflow is a funnel model:

    1. Gather signals from hundreds of downstream APIs.

    2. Execute ML models dependent on these signals.

    3. Apply rules based on model scores and collected data.

    4. If more data is needed for the risk decision, trigger the next set of signal gathering → model execution → rules evaluation, and so on, forming a chain of conditional steps (a rough sketch of one such round follows this list).

  • There are ~5,000 business-configurable workflows, which change frequently as new signals become available or scoring logic evolves.

  • The workflows are currently executed by a Java orchestrator that handles their highly conditional, dynamic control flow.

  • Most of the workflow time is spent on I/O (API calls) and waiting for model inference, not on CPU-heavy computation.

  • The system must handle very high throughput, peaking at 500K requests/hour (≈140 requests/second), and return results within ~150 ms end-to-end latency.
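
To make the funnel concrete, here is a minimal, Ray-free sketch of one request flowing through a configurable chain of rounds using plain asyncio. The helpers (fetch_signal, run_model, evaluate_rules) and the config structure are illustrative placeholders, not our actual implementation:

```python
import asyncio

async def fetch_signal(api_name: str, request: dict) -> dict:
    """Stand-in for one downstream API call (I/O-bound)."""
    await asyncio.sleep(0.01)  # placeholder for network latency
    return {api_name: "value"}

async def run_model(model_name: str, signals: dict) -> float:
    """Stand-in for a model-inference call (also mostly waiting)."""
    await asyncio.sleep(0.02)
    return 0.5

def evaluate_rules(max_score: float) -> str:
    """Rules decide: accept outright, or ask for another funnel round."""
    return "need_more_data" if max_score > 0.4 else "accept"

async def run_funnel(request: dict, rounds: list[dict]) -> str:
    signals: dict = {}
    for step in rounds:  # one iteration = one funnel round
        # 1. Gather this round's signals from the configured APIs concurrently.
        for fetched in await asyncio.gather(
            *(fetch_signal(api, request) for api in step["apis"])
        ):
            signals.update(fetched)
        # 2. Execute the models that depend on the collected signals.
        scores = await asyncio.gather(
            *(run_model(m, signals) for m in step["models"])
        )
        # 3. Apply rules; stop early unless more data is required.
        decision = evaluate_rules(max(scores))
        if decision != "need_more_data":
            return decision
    return "manual_review"  # funnel exhausted without a confident decision

if __name__ == "__main__":
    # Heavily simplified per-workflow configuration: each entry is one round.
    config = [
        {"apis": ["bureau", "device"], "models": ["fraud_v3"]},
        {"apis": ["velocity"], "models": ["fraud_deep"]},
    ]
    print(asyncio.run(run_funnel({"id": "req-1"}, config)))
```

Each entry in config stands in for one round of a business-configurable workflow definition; those configurations are what change frequently across the ~5,000 workflows.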

Questions:

  1. Would Ray Compiled Graphs or Ray Workflows (durable execution) be suitable for this type of I/O-heavy, low-latency, high-throughput workflow, where orchestration is complex and dynamic? (A rough sketch of what we have in mind is at the end of this post.)

  2. Is modeling this as a DAG-based workflow the right mental model, even though most of the execution time is I/O-bound?

  3. Is Ray primarily designed for business workflow orchestration like this, or is it better suited to compute-heavy workloads such as ML model training, reinforcement learning, or batch scoring, where the orchestration layer itself is comparatively thin?

We are looking for guidance on whether Ray fits this use case, or whether a lightweight, in-memory orchestrator would be a better approach.
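
For context on question 1, here is a rough sketch of how we imagine one funnel round looking on Ray core (an async actor driving remote tasks), rather than on Compiled Graphs or Ray Workflows specifically. All names are placeholders and we have not benchmarked any of this:

```python
import asyncio
import ray

ray.init(ignore_reinit_error=True)

@ray.remote
def gather_signals(api_name: str, request: dict) -> dict:
    # In production this would be an HTTP call to one downstream signal API.
    return {api_name: "value"}

@ray.remote
def score_model(model_name: str, signals: dict) -> float:
    # In production this would invoke a model-serving endpoint.
    return 0.5

@ray.remote
class FunnelOrchestrator:
    """Async actor that drives the conditional funnel for one request."""

    async def handle(self, request: dict, rounds: list) -> str:
        signals: dict = {}
        for step in rounds:
            # Fan out signal gathering as Ray tasks and await the ObjectRefs.
            for fetched in await asyncio.gather(
                *[gather_signals.remote(api, request) for api in step["apis"]]
            ):
                signals.update(fetched)
            # Run the models that depend on the collected signals.
            scores = await asyncio.gather(
                *[score_model.remote(m, signals) for m in step["models"]]
            )
            # Rules stand-in: stop early if the scores are confident enough.
            if max(scores) <= 0.4:
                return "accept"
        return "manual_review"

if __name__ == "__main__":
    orchestrator = FunnelOrchestrator.remote()
    config = [{"apis": ["bureau", "device"], "models": ["fraud_v3"]}]
    print(ray.get(orchestrator.handle.remote({"id": "req-1"}, config)))
```

The async actor is what we picture taking over the Java orchestrator's role; whether this kind of pattern can stay within a ~150 ms budget at peak throughput is the crux of question 1.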