Ray doesn't recognise CUDA GPU inside Docker container

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to run Ray with GPU support inside a custom Docker image, using Docker Compose.

The Dockerfile I am using:

FROM rayproject/ray:latest-py39-cu121

WORKDIR /opt/project

USER root

RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential gcc git wget

CMD ["bash"]

The Docker Compose file I am using:

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: test_ray
    image: test_ray
    volumes:
      - ./:/opt/project/
    tty: true
    stdin_open: true
    shm_size: 12gb
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Within the container, I can see that nvidia-smi recognises the GPU. However, running ray.get_gpu_ids() returns an empty list.
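
For reference, here is the check itself as a minimal self-contained sketch (ray.init() is added only so the snippet runs on its own; the exact command sequence I use is listed further down):

import ray

# Start a local Ray instance inside the container; Ray auto-detects the node's resources on init.
ray.init()

# IDs of the GPUs assigned to this process: returns [] in my case.
print(ray.get_gpu_ids())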

I have tried the following base images with no luck:

  • rayproject/ray:latest
  • rayproject/ray:2.20.0.5708e7-py310-cu121
  • rayproject/ray-ml:latest

The commands I use:

  • docker compose build
  • docker compose run app bash
  • nvidia-smi (inside the container; it shows the GPU and reports CUDA version 12.2)
  • python
  • import ray
  • print(ray.get_gpu_ids())
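
Put together, the whole session looks roughly like this (the first two commands run on the host, the rest inside the container):

# on the host
docker compose build
docker compose run app bash

# inside the container
nvidia-smi    # the GPU is listed and the driver reports CUDA 12.2
python        # then the snippet above: ray.init() followed by ray.get_gpu_ids(), which prints []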