ray_node_cpu_count does not match the num_cpus resource constraint

How severely does this issue affect your experience of using Ray?

  • Medium: It makes completing my task significantly more difficult, but I can work around it.

When enabling the Ray metrics, I found that the metric ray_node_cpu_count shows all physical CPUs on the node, despite setting num_cpus in ray.init().

Is this expected? In other words, does ray_node_cpu_count only describe the physical node where Ray is running, regardless of the num_cpus limit?

Hey @yzs, thank you for your question. The current ray_node_cpu_count is indeed expected to show ALL the CPUs on the node, rather than a logical view of the Ray cluster configured with num_cpus.
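
In case it helps anyone comparing the two views, here is a minimal sketch (not an official recipe). It reads the logical CPU count from ray.cluster_resources() and puts it next to the CPU count the OS reports. The helper names ray_logical_cpus and compare_cpu_views are made up for illustration, and using os.cpu_count() as a stand-in for what ray_node_cpu_count reports is an assumption.

```python
import os


def ray_logical_cpus(num_cpus: int) -> float:
    """Start a local Ray instance and return its logical CPU total.

    Hypothetical helper: Ray is imported lazily so the comparison
    function below can be used without Ray installed.
    """
    import ray

    ray.init(num_cpus=num_cpus)  # logical limit seen by the scheduler
    try:
        # cluster_resources() reports logical resources, e.g. {"CPU": 2.0, ...}
        return ray.cluster_resources().get("CPU", 0.0)
    finally:
        ray.shutdown()


def compare_cpu_views(logical_cpus: float) -> dict:
    """Pair Ray's logical CPU count with the host CPU count.

    Assumption: os.cpu_count() approximates the node-level value
    that ray_node_cpu_count exposes.
    """
    host_cpus = os.cpu_count()
    return {
        "logical": logical_cpus,
        "host": host_cpus,
        "match": logical_cpus == host_cpus,
    }
```

Calling compare_cpu_views(ray_logical_cpus(2)) on a machine with more than two CPUs should give a logical value of 2 next to the larger host value, which is the mismatch described above.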

We are currently working on exporting better metrics, so expect this to be addressed in an upcoming release (2.1 in mid-October, or 2.2 at the end of the year).

I created an issue to track this: [core][metric] Logical resource view · Issue #28799 · ray-project/ray · GitHub

Thanks for your quick response! I will keep an eye on the issue.