Ray ignores first gpu in multi-gpu environment?

Dvir_Ginzburg · December 6, 2020, 1:19pm

I have a local server with 8 GPUs.
No matter which configuration I use, Ray always ignores the first GPU:
ray.init(local_mode=False, num_cpus=10, num_gpus=8)

Ideas?

sangcho · December 8, 2020, 6:12am

The Ray scheduler doesn’t run anything on the GPU for each worker; the GPU count is used only for scheduling. My guess is that your application code is not properly using the first GPU for some reason. Are you using Tune or RLlib?

Dvir_Ginzburg · December 8, 2020, 6:40am

Thanks for the response. I’m using Tune. When running in local mode, Ray uses the first GPU. When I’m not using Ray at all, the first GPU is chosen and runs properly.

sangcho · December 10, 2020, 9:42am

cc @rliaw Do you know what’s causing this issue?

rliaw · December 10, 2020, 7:47pm

No, I don’t unfortunately. Maybe you can look at the dashboard to see what’s going on?
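
Besides the dashboard, one way to see what is going on is to ask each worker which GPU it was given. The following is a minimal sketch, not a confirmed diagnosis of this thread: it assumes Ray core tasks requesting one GPU each (rather than the poster's actual Tune trainable), and the helper names `visible_gpus`, `missing_gpus`, and `probe_with_ray` are hypothetical. It relies on `ray.get_gpu_ids()` and the `CUDA_VISIBLE_DEVICES` environment variable, and it assumes the tasks overlap in time so that each one receives a different GPU.

```python
import os


def visible_gpus(env=None):
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: the process can see every GPU
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def missing_gpus(reports, num_gpus):
    """Return the GPU indices in range(num_gpus) that no task reported."""
    seen = {str(gpu_id) for ids in reports for gpu_id in ids}
    return [i for i in range(num_gpus) if str(i) not in seen]


def probe_with_ray(num_gpus=8):
    """Launch one single-GPU task per GPU and list indices that were never assigned."""
    import time
    import ray  # imported lazily so the helpers above work without Ray installed

    # Same resource settings as in the original post (assumption: 8 GPUs, 10 CPUs).
    ray.init(num_cpus=10, num_gpus=num_gpus)

    @ray.remote(num_gpus=1)
    def probe():
        time.sleep(2)  # hold the GPU briefly so the tasks overlap
        return ray.get_gpu_ids(), visible_gpus()

    results = ray.get([probe.remote() for _ in range(num_gpus)])
    ray.shutdown()
    for gpu_ids, cuda_env in results:
        print("ray.get_gpu_ids():", gpu_ids, "CUDA_VISIBLE_DEVICES:", cuda_env)
    return missing_gpus([gpu_ids for gpu_ids, _ in results], num_gpus)
```

If `probe_with_ray()` returns an index such as `0`, that GPU was never handed to a task, which would point at the assignment side. If it returns an empty list, every GPU was assigned, and the training code's own device selection inside the trial would be the next place to look.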
