There is currently no built-in Ray feature to globally restrict non-GPU tasks to CPU-only nodes, or to make GPU nodes "opt-in" for only GPU-requiring tasks, without annotating every @ray.remote call or using custom resources/labels. The Ray scheduler will place a task on any node with enough available resources, so CPU-only tasks can land on GPU nodes whenever those nodes have free CPUs, unless you explicitly prevent it via resource requirements or by setting num_cpus=0 on the GPU nodes (which, as you noted, is brittle and can break code that expects CPUs) [Ray GitHub Issue #47866] [Ray Discuss: How to prevent scheduling non-GPU tasks to GPU nodes].
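As a sketch of the num_cpus=0 workaround (the head address and GPU count below are placeholders for your cluster):

```shell
# On each GPU node: advertise zero CPUs so the scheduler never places
# CPU-only tasks here. <head-node-ip> and --num-gpus are placeholders.
ray start --address=<head-node-ip>:6379 --num-cpus=0 --num-gpus=4

# Caveat: tasks request num_cpus=1 by default, so even GPU tasks that
# also ask for a CPU will not schedule on this node unless you set
# num_cpus=0 on those tasks as well -- this is the brittleness noted above.
```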
Best practices today are:
1. Use custom resources or node labels to tag CPU nodes and require those resources for CPU tasks; this does require updating all relevant @ray.remote calls.
2. Set num_cpus=0 on GPU nodes, but then you must use a custom resource for all tasks that should land there, which is not ideal.
3. Use placement groups or node affinity for fine-grained control; this also requires code changes.

There is ongoing discussion and feature requests for a more Kubernetes-like taints/tolerations or "opt-in" scheduling model, but it is not yet implemented [Ray GitHub Issue #47866] [Ray Discuss: Heterogeneous clusters]. Would you like more detail or code examples for any of these approaches?
Sources:
- Ray GitHub Issue #47866
- Ray Discuss: How to prevent scheduling non-GPU tasks to GPU nodes
- Ray Discuss: Heterogeneous clusters
Hint: Mention @RunLLM in the post for followups.