How to specify GPU resources in terms of GPU RAM rather than a fraction of a GPU

I was trying to use the custom resources feature to tell each node how much of a custom resource, gpu_memory_mb, it had, and then schedule my tasks based on their GPU RAM needs (the same way I use memory= for regular RAM). However, I seem to have to give each task a num_gpus option > 0.0 for it to see the GPU at all.
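
Roughly what I was trying, as a sketch (the gpu_memory_mb resource name and the totals are just placeholders I picked):

```python
import ray

# Declare the custom resource on the node, the same way I'd do it at startup with
#   ray start --head --resources='{"gpu_memory_mb": 16000}'
ray.init(resources={"gpu_memory_mb": 16000})

# Request GPU RAM the same way memory= is used for host RAM.
@ray.remote(resources={"gpu_memory_mb": 4000}, memory=2 * 1024**3)
def train_step():
    # Without num_gpus > 0, this task doesn't get a GPU assigned to it.
    return "done"

print(ray.get(train_step.remote()))
```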

I’ve been sticking in num_gpus=0.1 and assuming that the GPU RAM resource specification will be the only one that matters, but it feels very hacky. I also have to specify the GPU RAM totals manually when launching each worker node. Any chance this is something Ray could support in the core API, similar to the memory= option?
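
For reference, the workaround currently looks something like this (the numbers are illustrative):

```python
import ray

# One 16 GB GPU node, with the GPU RAM total maintained by hand.
ray.init(num_gpus=1, resources={"gpu_memory_mb": 16000})

@ray.remote(num_gpus=0.1, resources={"gpu_memory_mb": 8000})
def gpu_task():
    import os
    # The token num_gpus=0.1 is only there so the task gets a GPU assigned;
    # the real constraint is the gpu_memory_mb request.
    return os.environ.get("CUDA_VISIBLE_DEVICES")

print(ray.get(gpu_task.remote()))
```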

(The use case here is that I have limited access to a small number of GPUs with 32 GB of RAM, and easier access to some 16 GB GPUs. I’d like to tie them all together in one Ray cluster, but the GPU fraction needed for a given amount of GPU RAM depends on which kind of node the task lands on: a task needing 8 GB would be num_gpus=0.25 on a 32 GB card but num_gpus=0.5 on a 16 GB card. Related to Gpu wise memory allocation.)

> However, I seem to have to give each task a num_gpus option > 0.0 for it to see the GPU at all.

Can you tell me a bit more about what this means? What do you mean by “see the GPU at all”?

To be honest, I’m not sure exactly how it happens myself, but, e.g., a CuPy call will raise an exception indicating that no CUDA devices are available.
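
A rough sketch of what triggers it, assuming CuPy is installed on the worker (the exact exception text may differ):

```python
import ray

ray.init(num_gpus=1)

@ray.remote  # note: no num_gpus here
def no_gpu_requested():
    import cupy as cp
    # This is where I'd expect something like
    # CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected
    return int(cp.arange(10).sum())

try:
    print(ray.get(no_gpu_requested.remote()))
except Exception as exc:
    print(type(exc).__name__, exc)
```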

I see. I guess it’s because CUDA_VISIBLE_DEVICES is not being set properly. According to GPU Support — Ray v2.0.0.dev0, Ray automatically sets this env var when num_gpus is specified.
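
Something like this should show the difference (the printed values are just what I’d expect, not verified here):

```python
import os
import ray

ray.init(num_gpus=1)

@ray.remote(num_gpus=0.1)
def with_gpu():
    return ray.get_gpu_ids(), os.environ.get("CUDA_VISIBLE_DEVICES")

@ray.remote
def without_gpu():
    return ray.get_gpu_ids(), os.environ.get("CUDA_VISIBLE_DEVICES")

print(ray.get(with_gpu.remote()))     # e.g. ([0], "0")
print(ray.get(without_gpu.remote()))  # e.g. ([], "") -- no visible devices
```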

@rliaw is there a recommended way to use Ray with GPU memory-based scheduling? Should he always specify a small fraction for num_gpus?