Incorrect resource identification


I am starting a Ray head node, but I do not want it to see the GPU, so I start it with:

ray start --head --num-cpus 1 --num-gpus 0

The head node still comes up with a GPU detected:

1 node(s) with resources: {'node:': 1.0, 'memory': 354943174042.0, 'object_store_memory': 156404217446.0, 'CPU': 1.0, 'accelerator_type:V100': 1.0}

Is there a way to prevent Ray from auto-detecting resources?

Are you using the autoscaler for this? I think you are seeing this issue: "Autoscaler does not respect --num-cpus argument to `ray start`" (Issue #13270, ray-project/ray on GitHub).

cc @Ameer_Haj_Ali

My setup: I start the head node and have 3 worker nodes connect to it.
I am not sure whether the autoscaler is used in this setup. The head node auto-detects the GPU even though I do not want it to, while CPU is correctly set to 1.

@Dmitri any guess why this happens? Can you help diagnose this issue?

The “GPU” resource was correctly overwritten to 0, but the “accelerator_type” resource, which is meant to help schedule particular tasks onto machines with specific types of NVIDIA GPUs, was not removed.

This is a bug. The behavior is confusing, but harmless.
Will file an issue tomorrow.

Also, there is an API to override resource autodetection — will look that up and get back to you tomorrow.


I think this is a good first issue for @mwtian to work on as well.

Accelerator annotation issue tracked here: [core] Zero-gpu node shouldn't be marked with accelerator_type resource. · Issue #15878 · ray-project/ray · GitHub
Thanks for pointing this out!

There’s no way to turn off resource detection.
However, you can pass the arguments --memory, --num-cpus, --num-gpus, and --resources to `ray start` (memory, num_cpus, num_gpus, and resources in the Python API) to override the autodetected values.
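For illustration, here is a minimal sketch of assembling such a `ray start` command with every overridable resource set explicitly. The helper name `build_ray_start_head`, the custom resource name `head_only`, and the 8 GiB memory value are made up for this example and are not part of Ray. The script only prints the command and does not run it. As noted earlier in the thread, the accelerator_type entry may still show up on affected versions even with `--num-gpus 0`.

```python
import json
import shlex


def build_ray_start_head(num_cpus, num_gpus, memory_bytes=None, resources=None):
    """Return the argv list for a `ray start --head` call with explicit resources.

    Hypothetical helper for illustration; it only builds the command line.
    """
    argv = [
        "ray", "start", "--head",
        "--num-cpus", str(num_cpus),
        "--num-gpus", str(num_gpus),
    ]
    if memory_bytes is not None:
        # --memory is given in bytes.
        argv += ["--memory", str(memory_bytes)]
    if resources:
        # --resources takes a JSON object mapping resource names to quantities.
        argv += ["--resources", json.dumps(resources)]
    return argv


# Head node with 1 CPU, 0 GPUs, 8 GiB of memory, and one custom resource.
argv = build_ray_start_head(
    1, 0, memory_bytes=8 * 1024**3, resources={"head_only": 1}
)
print(shlex.join(argv))
```

Copy the printed line into a shell on the head node, or pass `argv` to `subprocess.run` if you prefer to launch it from Python.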