Is there a plan to support NPU as a backend in accelerated DAG (aDAG)?

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.

2. Environment:

  • Ray version: 2.48.0
  • Python version: python3.9
  • OS:

3. What happened vs. what you expected:

I have several NPU cards and would like to use the Compiled Graph feature with vLLM to reproduce pipeline parallelism (PP). Does the community currently plan to support NPU as a backend for aDAG?


Hey, I think you might already be able to use compiled graphs with NPUs. See PR #51032, "[Compiled Graph] Enhance Compile Graph with Multi-Device Support" by hipudding, and issue #51574, "[CG, Core] Add Ascend NPU Support for RCCL and CG", both in ray-project/ray on GitHub. There's also some ongoing work to support them with Ray collectives and the new Ray Direct Transport feature: PR #55428, "[WIP] [core] [object store] Support HCCL as a tensor transport backend with object store" by Kishanthan, and https://github.com/ray-project/ray/pull/55381
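For reference, here is a minimal sketch of what a two-stage pipeline looks like as a compiled graph. It runs on plain CPU actors and has not been tried on NPUs; whether the same graph runs on NPU depends on the work linked above. The names `stage_one`, `stage_two`, `StageWorker`, and `build_compiled_pipeline` are made up for illustration, and the arithmetic stands in for real model stages.

```python
def stage_one(x):
    """Placeholder for the first pipeline stage (hypothetical)."""
    return x * 2


def stage_two(x):
    """Placeholder for the second pipeline stage (hypothetical)."""
    return x + 1


def build_compiled_pipeline():
    """Chain two actors into a compiled graph: input -> stage_one -> stage_two."""
    # Imported lazily so the stage functions above can be used without Ray.
    import ray
    from ray.dag import InputNode

    @ray.remote
    class StageWorker:
        """One actor per pipeline stage; each wraps a plain function."""

        def __init__(self, fn):
            self.fn = fn

        def forward(self, x):
            return self.fn(x)

    first = StageWorker.remote(stage_one)
    second = StageWorker.remote(stage_two)

    # Describe the dataflow once, then compile it for repeated execution.
    with InputNode() as inp:
        dag = second.forward.bind(first.forward.bind(inp))
    return dag.experimental_compile()


if __name__ == "__main__":
    try:
        import ray
    except ImportError:
        print("Ray is not installed; skipping the compiled graph demo.")
    else:
        compiled = build_compiled_pipeline()
        try:
            for x in range(3):
                # execute() returns a reference; ray.get() waits for the result.
                print(ray.get(compiled.execute(x)))
        finally:
            compiled.teardown()
```

This only passes small Python values between actors through the default channel. Moving tensors between accelerator devices is where the transport backend discussed in this thread comes in.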

As of now, the mainline Ray documentation and API references do not describe official support for NPU as a backend in Ray Compiled Graphs (aDAG). The official docs and API guides describe only GPU (CUDA/NCCL) and CPU transports. The tensor transport options currently listed are “auto”, “cpu”, “accelerator”, “nccl”, and “shm”, where “accelerator” refers to GPU/XLA or vendor-provided backends rather than NPU specifically. The PRs and issues you referenced (#51032, #51574, #55428, #55381) show ongoing community and contributor interest, but these features are not yet documented as available or stable in the official Ray releases or user guides.

If you are interested in NPU support, consider following those PRs and issues for updates or joining the discussion. For now, Ray Compiled Graphs are officially supported on GPU and CPU; NPU support is experimental or under development and is not part of the stable API or documentation. Would you like more detail on the current state of accelerator support in Ray Compiled Graphs?
