NVIDIA GPU not detected

Ray is not detecting my GPU. Any clue what I should do?

$ nvidia-smi -L
GPU 0: NVIDIA GeForce GTX 1650 (UUID: GPU-9a194227-e6e5-7574-70df-22dbe4657f08)
>>> torch.cuda.is_available()
>>> torch.version.cuda
Traceback (most recent call last):
  File "simple_graph_heuristic_gnn.py", line 829, in <module>
    analysis = tune.run(
  File "/home/genesis/miniconda3/envs/gt/lib/python3.8/site-packages/ray/tune/tune.py", line 585, in run
  File "/home/genesis/miniconda3/envs/gt/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 627, in step
  File "/home/genesis/miniconda3/envs/gt/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 394, in _run_and_catch
  File "/home/genesis/miniconda3/envs/gt/lib/python3.8/site-packages/ray/tune/trial_executor.py", line 321, in on_no_available_trials
  File "/home/genesis/miniconda3/envs/gt/lib/python3.8/site-packages/ray/tune/trial_executor.py", line 301, in _may_warn_insufficient_resources
    raise TuneError(
ray.tune.error.TuneError: You asked for 3.0 cpu and 1.0 gpu per trial, but the cluster only has 12.0 cpu and 0 gpu. Stop the tuning job and adjust the resources requested per trial (possibly via `resources_per_trial` or via `num_workers` for rllib) and/or add more resources to your Ray runtime.

And does this line not throw Ray off?

ray.init(num_cpus=12, num_gpus=1)

This line makes it run! I still don't understand why I need to use all the CPUs for it to work.


I think the key is the num_gpus part. You should be able to set num_gpus to however many GPUs you want to allocate.
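
To illustrate, here is a rough sketch of why the override helps. The `fits` helper and `init_ray_with_gpus` are hypothetical names made up for this post, and `fits` is only a simplified stand-in for the comparison behind the TuneError, not Ray's actual logic. `ray.init` detects the CPU count on its own when `num_cpus` is left out, so passing only `num_gpus` may well be enough; I have not tested that on your setup.

```python
def fits(cluster, per_trial):
    """Return True if one trial's request fits within the cluster totals.

    Simplified illustration only; this is not Ray's internal check.
    """
    return all(cluster.get(name, 0) >= amount for name, amount in per_trial.items())


def init_ray_with_gpus(num_gpus=1):
    """Hypothetical helper: start Ray with an explicit GPU count.

    Returns the resources Ray reports so you can confirm a GPU is listed.
    """
    import ray  # imported lazily so fits() can be used without Ray installed

    ray.init(num_gpus=num_gpus)  # num_cpus omitted: Ray detects the CPU count itself
    return ray.cluster_resources()


# The situation from the error message: 12 CPUs detected, 0 GPUs.
detected = {"CPU": 12.0, "GPU": 0.0}
request = {"CPU": 3.0, "GPU": 1.0}
print(fits(detected, request))

# With the GPU count overridden to 1, the same per-trial request fits.
overridden = {"CPU": 12.0, "GPU": 1.0}
print(fits(overridden, request))
```

Calling `init_ray_with_gpus()` before `tune.run` and printing its result is a quick way to see whether Ray now counts the GPU.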