`get_gpu_ids` is not empty but `torch.cuda.is_available` is false

Robur · November 11, 2021, 5:42pm

Hi,

I am new to Ray, and while trying to use it I ran into a problem that searching has not helped me solve.

The setup is as follows: I am launching Ray on a local machine inside Docker, initializing it with a single GPU and ten CPU cores.

One worker fails with

RuntimeError: No CUDA GPUs are available

In that worker, get_gpu_ids() returns [0] but torch.cuda.is_available() returns False. At the same time, there is one process in which torch.cuda.is_available() returns True.
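A minimal sketch of how the mismatch could be inspected from inside a task is below. It assumes that Ray passes a GPU assignment to a worker through the CUDA_VISIBLE_DEVICES environment variable, so comparing that variable with what get_gpu_ids() and torch.cuda.is_available() report may help narrow down where the disagreement starts. The helper names (visible_gpu_report, probe, run_probe_on_ray) are hypothetical, and the num_gpus=1 / num_cpus=10 values only mirror the setup described above.

```python
import os


def visible_gpu_report(env=None):
    """Summarise what CUDA_VISIBLE_DEVICES allows this process to see."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable not set at all: CUDA applies no restriction from the environment.
        return {"raw": None, "devices": None, "hidden": False}
    devices = [d.strip() for d in raw.split(",") if d.strip()]
    # Set but empty means every GPU is hidden from this process.
    return {"raw": raw, "devices": devices, "hidden": len(devices) == 0}


def probe():
    """Collect the three views of GPU visibility in the current process."""
    report = {"pid": os.getpid(), **visible_gpu_report()}
    try:
        import ray
        report["ray_gpu_ids"] = ray.get_gpu_ids()
    except Exception as exc:  # ray missing or not connected
        report["ray_gpu_ids"] = f"unavailable: {exc!r}"
    try:
        import torch
        report["torch_cuda_available"] = torch.cuda.is_available()
        report["torch_device_count"] = torch.cuda.device_count()
    except Exception as exc:  # torch missing or CUDA initialisation failed
        report["torch_cuda_available"] = f"unavailable: {exc!r}"
    return report


def run_probe_on_ray():
    """Hypothetical driver: run the probe in a task that requests one GPU."""
    import ray
    ray.init(num_gpus=1, num_cpus=10)  # assumed to match the setup in this post
    remote_probe = ray.remote(num_gpus=1)(probe)
    return ray.get(remote_probe.remote())
```

Calling run_probe_on_ray() from the driver and comparing its output with a plain probe() call in the process where CUDA works would show whether the two processes see different CUDA_VISIBLE_DEVICES values or whether PyTorch fails even though the variable looks correct.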

Is this expected behavior? If so, could you please point me to the relevant documentation so that I can better understand how Ray works? If not, what can I do to avoid this error?

I am running Ubuntu 18.04, PyTorch 1.9+cu111, and Ray 1.8.

Thanks.

sangcho · November 23, 2021, 3:27pm

cc @rliaw @sven1977 can you guys address this question?
