Ray cannot detect GPU on Databricks cluster

  • High: It blocks me from completing my task.

I am trying to run Ray on Databricks for chunking and embedding tasks. The cluster I'm using is:

g4dn.xlarge
1-4 workers with 4-16 cores
1 GPU and 16GB memory

I have set spark.task.resource.gpu.amount to 0.5 currently.

This is how I have set up my Ray cluster:

setup_ray_cluster(
  min_worker_nodes=1,
  max_worker_nodes=3,
  num_gpus_head_node=1,
)

And this is the chunking function:

@ray.remote(num_gpus=0.2)
def chunk_udf(row):
    texts = row["content"]
    data = row.copy()
    split_text = splitter.split_text(texts)
    split_text = [text.replace("\n", " ") for text in split_text]
    return list(zip(split_text, data))

When I run flat_map for chunking, it throws the following error:

chunked_ds = ds.flat_map(chunk_udf)
chunked_ds.show(5)

At least one of the input arguments for this task could not be computed:
ray.exceptions.RaySystemError: System error: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
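From searching around, this error seems to mean that a GPU-resident object (in my case, most likely the splitter captured from the driver into the task closure) is being pickled and then unpickled on a worker where torch.cuda.is_available() is False. One workaround I'm considering is building the heavyweight object lazily inside the task, one copy per worker process, instead of capturing it. Here is a pure-Python sketch of that pattern (no Ray/torch needed; str.split is a hypothetical stand-in for the real splitter):

```python
_splitter = None  # one instance per worker process

def get_splitter():
    # Build the heavy object on first use inside the worker instead of
    # shipping it from the driver. In the real task this is where the
    # splitter / model would be constructed, e.g. via
    # torch.load(path, map_location="cpu").
    global _splitter
    if _splitter is None:
        _splitter = str.split  # hypothetical stand-in for the splitter
    return _splitter

def chunk_udf(row):
    splitter = get_splitter()
    texts = row["content"]
    split_text = splitter(texts)
    return [text.replace("\n", " ") for text in split_text]
```

In the actual task, chunk_udf would still be decorated with @ray.remote(num_gpus=0.2); the point is that nothing GPU-resident crosses the pickle boundary at submit time.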

Is there something I need to change in my setup?
torch.cuda.is_available() returns True in the notebook.

I have also tried setting spark.task.resource.gpu.amount to 0, but it still throws the same error.

Hello @awoke101, can you share what you see with ray.cluster_resources()?
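For reference, on a healthy single-GPU g4dn.xlarge node you would expect the resource dict to include a GPU entry. The values below are illustrative only, not taken from your cluster:

```python
# Hypothetical ray.cluster_resources() output on a g4dn.xlarge node
# (4 vCPUs, 16 GB memory, 1 GPU); values are illustrative.
resources = {"CPU": 4.0, "GPU": 1.0, "memory": 17179869184.0}

# If the 'GPU' key is missing or zero, Ray never registered the device,
# which would match torch.cuda.is_available() being False inside tasks.
assert resources.get("GPU", 0.0) >= 1.0
```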