I am a starter to ray but failed on a VERY simple example

lzy37ld · January 26, 2025, 7:16am

I am trying to use Ray on a high-performance computing (HPC) cluster managed by Slurm.

Here is my code, which is really simple…

import ray
ray.init()

@ray.remote
def f(x):
    return x * x

futures = [f.remote(i) for i in range(4)]
print(ray.get(futures)) # [0, 1, 4, 9]

I ran python test.py on the allocated node, but it failed. Can anyone kindly help?
It just gets stuck at:

2025-01-26 02:10:34,453	INFO worker.py:1841 -- Started a local Ray instance.
E0126 02:10:37.663225099 1169908 thd.cc:157]                           pthread_create failed: Resource temporarily unavailable

Thanks for any help in advance!

christina · January 27, 2025, 6:24pm

Hi there!
It seems like you might be running into some resource allocation issues in Slurm. Here are a few things you might want to try.

Set OMP_NUM_THREADS: If you are using libraries that utilize OpenMP (such as NumPy or SciPy), set the environment variable OMP_NUM_THREADS=1 before running your script. This limits the number of threads used by these libraries. Read more about it here: Install RLlib for Development — Ray 2.41.0
Increase Slurm Resources: Ensure that your Slurm job is requesting enough resources (CPUs, memory) to handle the workload. You might need to adjust your Slurm script to request more. What resources does your job currently request? (Here are some docs you can read too: Debugging Memory Issues — Ray 2.41.0)
Run Ray's Debugging Tools: Try some of the Ray debugging tools, like ray memory or ray stack, to see if any other errors show up.
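
To make the first two suggestions concrete, here is a minimal sketch of how they could be applied inside the script. It is not a confirmed fix. It assumes the job was submitted with --cpus-per-task, so that Slurm exports SLURM_CPUS_PER_TASK, and otherwise falls back to the CPUs the process is allowed to run on. The helper names allocated_cpus and start_ray are made up for illustration; num_cpus is the existing ray.init argument that caps how many CPUs Ray assumes it has, instead of letting it autodetect the machine's CPU count.

```python
import os


def allocated_cpus(default=1):
    """Best-effort guess at how many CPUs this Slurm job may use."""
    # SLURM_CPUS_PER_TASK is only set when the job requested --cpus-per-task.
    value = os.environ.get("SLURM_CPUS_PER_TASK")
    if value and value.isdigit() and int(value) > 0:
        return int(value)
    # Fall back to the CPUs this process is allowed to run on (Linux only).
    if hasattr(os, "sched_getaffinity"):
        return max(1, len(os.sched_getaffinity(0)))
    return default


def start_ray():
    """Start a local Ray instance capped at the job's CPU allocation."""
    # Set this before NumPy/SciPy are imported so their thread pools pick it up.
    os.environ.setdefault("OMP_NUM_THREADS", "1")
    import ray  # imported here so allocated_cpus() works without Ray installed

    ray.init(num_cpus=allocated_cpus())
    return ray
```

Calling start_ray() in place of the bare ray.init() in test.py should leave the rest of the script unchanged.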

Just from what you described, though, my guess is that there might not be enough resources allocated, so let me know what you find out after adjusting some settings!
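
If it helps to gather numbers before changing anything, here is a small sketch that prints the limits most likely to matter. "Resource temporarily unavailable" from pthread_create generally means a limit on threads or processes was hit, and on Linux threads count toward the per-user process limit (the same value ulimit -u shows). The function name limits_report is made up for illustration; everything it calls is in the Python standard library, and a value of -1 means unlimited.

```python
import os
import resource


def limits_report():
    """Collect the limits most relevant to 'pthread_create failed'."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    # sched_getaffinity is Linux-only; it reflects the CPUs the job is pinned to.
    usable = len(os.sched_getaffinity(0)) if hasattr(os, "sched_getaffinity") else None
    return {
        "cpus_on_node": os.cpu_count(),
        "cpus_usable": usable,
        "max_user_processes_soft": soft,
        "max_user_processes_hard": hard,
        "slurm_cpus_per_task": os.environ.get("SLURM_CPUS_PER_TASK"),
    }


if __name__ == "__main__":
    for name, value in limits_report().items():
        print(f"{name}: {value}")
```

Run it on the allocated node, in the same way test.py was run. A large gap between cpus_on_node and cpus_usable, or a small soft process limit, would be consistent with the guess above.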
Christina
