Ray doesn't recognise CUDA GPU inside Docker container

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to run Ray with GPU support inside a custom Docker image, using Docker Compose.

The Dockerfile I am using:

FROM rayproject/ray:latest-py39-cu121

WORKDIR /opt/project

USER root

RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential gcc git wget

CMD ["bash"]

The Docker Compose file I am using:

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: test_ray
    image: test_ray
    volumes:
      - ./:/opt/project/
    tty: true
    stdin_open: true
    shm_size: 12gb
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Within the container, I can see that nvidia-smi recognises the GPU. However, running ray.get_gpu_ids() returns an empty list.
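
For reference, here is the check itself as a minimal self-contained sketch (ray.init() is added only so the snippet runs on its own; the exact command sequence I use is listed further down):

import ray

# Start a local Ray instance inside the container; Ray auto-detects the node's resources on init.
ray.init()

# IDs of the GPUs assigned to this process: returns [] in my case.
print(ray.get_gpu_ids())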

I have tried the following base images with no luck:

  • rayproject/ray:latest
  • rayproject/ray:2.20.0.5708e7-py310-cu121
  • rayproject/ray-ml:latest

The commands I use:

  • docker compose build
  • docker compose run app bash
  • nvidia-smi (inside the container; it shows the GPU and reports CUDA version 12.2)
  • python
  • import ray
  • print(ray.get_gpu_ids())
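
Put together, the whole session looks roughly like this (the first two commands run on the host, the rest inside the container):

# on the host
docker compose build
docker compose run app bash

# inside the container
nvidia-smi    # the GPU is listed and the driver reports CUDA 12.2
python        # then the snippet above: ray.init() followed by ray.get_gpu_ids(), which prints []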