All Ray resources mapped to only two physical processors

This issue, where only a subset of CPU cores (e.g., cores 1 and 17) are fully utilized while the rest sit idle, typically occurs when Ray tasks are not parallelized as expected, or when the underlying code is single-threaded or constrained by environment variables that control thread usage. In particular, libraries such as PyTorch and NumPy may need the thread count set explicitly (e.g., with `torch.set_num_threads(num_cpus)`) before they will use all available cores. Otherwise, Ray may schedule tasks across all CPUs while each task's actual computation runs on a single core, leaving the remaining CPUs underutilized. This was confirmed as the root cause in a similar case: setting `torch.set_num_threads(num_cpus)` resolved the issue on VMs, whereas leaving it at the default caused only one core to be used despite Ray's resource allocation settings. See the discussion for more details: Usage of CPU resource on RayCluster GCloud.
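As a minimal sketch of the fix described above (the helper name `configure_thread_env` and the chosen thread count are illustrative, not from the original post), you can pin the standard OpenMP/MKL/OpenBLAS threading variables before the numeric libraries are imported, or call `torch.set_num_threads(num_cpus)` after importing PyTorch:

```python
import os


def configure_thread_env(num_cpus: int) -> None:
    """Set the common threading knobs so numeric libraries use num_cpus threads.

    These environment variables must be set BEFORE NumPy/PyTorch are first
    imported; once the library is loaded, use its own API instead
    (e.g., torch.set_num_threads(num_cpus)).
    """
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        os.environ[var] = str(num_cpus)


# Example: allow each task to use 8 threads (an illustrative value).
configure_thread_env(8)
print(os.environ["OMP_NUM_THREADS"])  # -> "8"
```

In a Ray program, the natural place to do this is at the top of each remote function, passing the same CPU count you requested via `@ray.remote(num_cpus=...)`, so the per-task thread count matches the per-task CPU allocation.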

Would you like a step-by-step guide on how to diagnose and fix this in your environment?
