I am using Ray for resource scheduling in my training job. The job is configured with 1 node, 4 GPUs, and 84 CPUs. During training, however, I get the following message:
(raylet) There are tasks with infeasible resource requests that cannot be scheduled. See Scheduling — Ray 2.52.1 for more details. Possible solutions: 1. Updating the ray cluster to include nodes with all required resources 2. To cause the tasks with infeasible requests to raise an error instead of hanging, set the 'RAY_enable_infeasible_task_early_exit=true'. This feature will be turned on by default in a future release of Ray.
Here is the output of ray status:
======== Autoscaler status: 2025-12-05 10:52:58.474102 ========
Node status
Active:
1 headgroup
1 worker-group
Idle:
(no idle nodes)
Pending:
(no pending nodes)
Recent failures:
(no failures)
Resources
Total Usage:
5.0/84.0 CPU (4.0 used of 4.0 reserved in placement groups)
4.0/4.0 GPU (4.0 used of 4.0 reserved in placement groups)
0B/1.03TiB memory
1.47GiB/630.17GiB object_store_memory
From request_resources:
(none)
Pending Demands:
(no resource demands)
Is there a problem here? Why am I receiving this warning? How can I resolve it? Thank you.
The warning means Ray detected tasks or actors whose resource requirements cannot be satisfied by any node in your cluster. These are called "infeasible" requests. Feasibility is judged per node, not against the cluster-wide totals that ray status reports: even if the cluster as a whole shows free CPUs and GPUs, the specific combination requested by a task or actor may exceed what any single node can provide, so it stays unschedulable and hangs. This is explained in the Ray scheduling resources documentation.
To resolve this, check the resource requirements (CPU, GPU, custom resources) specified for your tasks/actors and ensure they fit within the resources available on a single node, since Ray cannot split a single task/actor across multiple nodes. If you want infeasible tasks to error out instead of hanging, set the environment variable RAY_enable_infeasible_task_early_exit=true as suggested in the warning. Would you like more detail on how to debug which task is causing the issue or how to adjust your resource requests?
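As a rough illustration of the per-node check described above, here is a minimal Python sketch. The helper names feasible_nodes and live_node_totals are hypothetical (they are not part of Ray), and the head/worker split in example_nodes is an assumption made for illustration only, since the ray status output does not show how the 84 CPUs and 4 GPUs are divided between the two nodes. The only Ray calls used are ray.init and ray.nodes.

```python
from typing import Dict, List

Resources = Dict[str, float]


def feasible_nodes(request: Resources, node_totals: List[Resources]) -> List[int]:
    """Return indices of nodes whose TOTAL resources cover the whole request.

    An empty result means the request is infeasible: no single node could
    ever run it, no matter how idle the cluster becomes.
    """
    return [
        i
        for i, total in enumerate(node_totals)
        if all(total.get(name, 0.0) >= amount for name, amount in request.items())
    ]


def live_node_totals() -> List[Resources]:
    """Read per-node totals from a running cluster (requires Ray)."""
    import ray  # imported lazily so feasible_nodes() works without Ray

    ray.init(address="auto", ignore_reinit_error=True)
    return [node["Resources"] for node in ray.nodes() if node["Alive"]]


# HYPOTHETICAL split of the 84 CPUs / 4 GPUs across the two nodes;
# on the real cluster, use live_node_totals() instead.
example_nodes = [{"CPU": 4.0}, {"CPU": 80.0, "GPU": 4.0}]

print(feasible_nodes({"CPU": 8, "GPU": 1}, example_nodes))   # [1]
print(feasible_nodes({"CPU": 84, "GPU": 4}, example_nodes))  # []
```

Under that assumed split, a request for all 84 CPUs in one task or actor matches no node even though the cluster total is 84, which is the kind of mismatch that produces the warning. Running the same check against live_node_totals() with the resource requests from your training code should show which one cannot be placed.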
However, even after running export RAY_enable_infeasible_task_early_exit=true, the task does not seem to exit, and the file /home/work/ray/session_2025-12-05_11-03-45_501767_1/logs/ray_process_exit.log is empty.