Ray: Resource request cannot be scheduled — how to check CPU usage or actor resource allocation

Hello there,

I tried training a BC algorithm using offline data and enabled the RL module in the algorithm configuration. I ran the code on Google Colab, which only provides 2 CPUs, and encountered the following error:

The following resource request cannot be scheduled right now: {'CPU': 1.0}.
This is likely due to all cluster resources being claimed by actors.

If I disable the RL module, the BC algorithm runs without problems. However, when the RL module is enabled, the code gets stuck due to the CPU scheduling error above.

My question is:

  • How can I check which resources (CPUs, actors, etc.) are currently being scheduled or used?
  • How can I identify what might be causing this scheduling issue in my code?

Would really appreciate any suggestions or debugging tips!

My Environment

  • Platform: Google Colab (2 CPUs)
  • Ray version: 3.0.0.dev0
  • Python version: 3.10

My Code

import gymnasium as gym

from ray.rllib.algorithms.bc import BCConfig
from ray.rllib.core.rl_module.rl_module import RLModule, RLModuleSpec
from ray.rllib.core.testing.torch.bc_module import DiscreteBCTorchModule

config = (
    BCConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment("CartPole-v1")
    .learners(num_learners=0)
    .offline_data(
        input_="/content/cartpole/large.json",
        input_read_method="read_json",
        dataset_num_iters_per_learner=1,
    )
    .training(lr=0.00001, gamma=0.99, beta=0.0)
    .rl_module(rl_module_spec=RLModuleSpec(module_class=DiscreteBCTorchModule))
    .evaluation(
        evaluation_interval=1,
        evaluation_num_env_runners=1,
        evaluation_duration=1,
    )
)

algo = config.build()

result = algo.train()

Output Message

2025-10-17 07:46:34,505 INFO worker.py:1783 -- Started a local Ray instance.
2025-10-17 07:46:43,961 INFO streaming_executor.py:108 -- Starting execution of Dataset. Full logs are in /tmp/ray/session_2025-10-17_07-46-30_753240_7705/logs/ray-data
2025-10-17 07:46:43,961 INFO streaming_executor.py:109 -- Execution plan of Dataset: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadJSON]
2025-10-17 07:46:56,308 INFO trainable.py:161 -- Trainable.setup took 25.799 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2025-10-17 07:46:56,309 WARNING util.py:61 -- Install gputil for GPU system monitoring.
2025-10-17 07:46:56,323 INFO streaming_executor.py:108 -- Starting execution of Dataset. Full logs are in /tmp/ray/session_2025-10-17_07-46-30_753240_7705/logs/ray-data
2025-10-17 07:46:56,323 INFO streaming_executor.py:109 -- Execution plan of Dataset: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadJSON] -> ActorPoolMapOperator[MapBatches(OfflinePreLearner)]
(autoscaler +52s) Tip: use ray status to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.
(autoscaler +52s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
(autoscaler +1m27s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
(autoscaler +2m2s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.

To check which resources (CPUs, actors, etc.) are currently being scheduled or used in Ray, you can call ray.available_resources() and ray.nodes() from your Python code. These show the cluster's logical resources and how much of them is still free, which helps you confirm whether every CPU has been claimed by actors or tasks. This is especially useful when you see scheduling warnings like "resource request cannot be scheduled right now" caused by resource exhaustion. For more details, see the forum discussion "Too many actors cause worker is done".
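For example, here is a minimal sketch of how you could query this from a Colab cell (the comments describe typical output; ray.cluster_resources() is included so you can compare the totals against what is still free):

import ray

# Attach to the Ray instance that RLlib already started (or start a new one).
ray.init(ignore_reinit_error=True)

# Total logical resources registered with the cluster (2 CPUs on Colab).
print(ray.cluster_resources())

# Resources that are still unclaimed; a missing or 0.0 'CPU' entry means
# every CPU is currently held by actors or running tasks.
print(ray.available_resources())

# Per-node breakdown: address, liveness, and the resources the node offers.
for node in ray.nodes():
    print(node["NodeManagerAddress"], node["Alive"], node["Resources"])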

Additionally, the Ray Dashboard (if available) provides a visual overview of resource usage, active actors, and pending tasks, which can help you debug resource bottlenecks. On Colab you typically cannot reach the dashboard, so the Python API is the most practical approach. To see which actors are running, you can also use ray.util.state.list_actors() for more detailed actor state information. For more debugging tips, see "Debugging Ray Data auto-scaling errors".
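Here is a short sketch of that, assuming a recent Ray build where the state API returns record objects with attribute access:

from ray.util.state import list_actors, list_tasks

# One line per actor: class name, lifecycle state (e.g. ALIVE,
# PENDING_CREATION, DEAD), and the node it was placed on.
for actor in list_actors():
    print(actor.class_name, actor.state, actor.node_id)

# Tasks that cannot get a CPU show up here in a pending state.
for task in list_tasks():
    print(task.name, task.state)

From a shell (a Colab cell prefixed with !), ray status prints the same cluster-level summary that the autoscaler tip in your log refers to.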

Would you like more detail on how to interpret these outputs or how to adjust your configuration to avoid resource deadlocks?
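If you want to experiment right away, one common pattern on a 2-CPU machine is to ask RLlib for fewer long-lived actors so that the Ray Data read job still has a CPU to run on. Treat the snippet below as a sketch rather than a verified fix, since the exact resource footprint depends on your Ray/RLlib build:

config = (
    config
    # Sample on the local EnvRunner instead of a dedicated actor
    # (num_learners=0 already keeps the Learner in the local process).
    .env_runners(num_env_runners=0)
    # Run evaluation on the local EnvRunner as well.
    .evaluation(evaluation_num_env_runners=0)
)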

Hint: Mention @RunLLM in the post for followups.