How to test if TensorFlow can see the GPU on a worker node using Ray?

Hey folks,

I was playing around and trying to get up and running with TensorFlow. I found the following script helpful for making sure everything was working, and I wanted to share it with the community.

import ray

# Connect to the running cluster when executed on the head node.
ray.init(address="auto")


# Request one GPU so Ray schedules the task on the GPU worker,
# not on the CPU-only head node.
@ray.remote(num_gpus=1)
def test_tf():
    import tensorflow as tf
    return tf.config.list_physical_devices("GPU")


print(ray.get([test_tf.remote() for _ in range(10)]))

My YAML looks like the following:

min_workers: 1
max_workers: 1

docker:
    image: anyscale/ray-ml:latest-cpu
    head_image: anyscale/ray-ml:latest-cpu
    worker_image: anyscale/ray-ml:latest-gpu
    container_name: ray_container
    pull_before_run: False

head_setup_commands: []

setup_commands:
    - pip install -U ray
    - pip install tensorflow

worker_setup_commands:
    - apt-get install -y libcudnn7= libcudnn7-dev=

idle_timeout_minutes: 5

provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a

worker_nodes:
    InstanceType: p2.xlarge

The above worked great for me and allowed me to easily debug my application. Please let us know if it doesn’t work for you!


Thanks a lot for the script! The first part worked for me. What output did you get from print(ray.get([test_tf.remote() for x in range(10)]))? I am in the process of debugging, and I get [[], [], [], [], [], [], [], [], [], []].

You should get some actual output, not just empty lists, IIRC. If it's empty (like your output), that means the task can't see a GPU device!
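To make that check explicit: ray.get returns one device list per task, and on a healthy GPU worker every list should be non-empty. A minimal sketch of such a check (the all_gpus_visible helper name is my own, not from the original script):

```python
def all_gpus_visible(results):
    # results is the list returned by ray.get(...): one device list per task.
    # Every task should report at least one visible GPU device.
    return all(len(devices) > 0 for devices in results)


# With the all-empty output from the post above, no task saw a GPU:
print(all_gpus_visible([[], [], []]))  # -> False
```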