Ray distributed memory parallelism

wrios · June 10, 2023, 3:54am

Hi,

When it comes to creating a ray cluster, is there a way to limit the number of workers per node? My remote function generates large amounts of data and I want to avoid having many threads on the same node.

rickyyx · June 15, 2023, 5:15pm

Hi @wrios - great question!

In general, each ray node will be started by ray start or ray.init(), and you could pass “num_cpus” to specify the logical number of CPUs that this node has. This will be equivalent to the maximal number of workers processes that Ray starts. (Do note that for each ray worker process, there might be multiple threads being used)

And, how are you creating the cluster? I could probbaly provide more specific help if I know that.

wrios · June 15, 2023, 5:42pm

Thank you for your interest in helping. To set up the cluster I am using a similar script as depicted in this example. Because a remote function can generate large amounts of data before it returns something, as occur in wave simulation where the history might need to be saved, it can fill up the node memory very easily if I have several ray workers in the same node. I was just wondering if there is a way to enforce ray workers to be distributed across nodes so as to avoid this to happen.

Justin_Coffi · October 20, 2023, 3:53pm

I’m just an amateur Ray user. But have you considered a centrally available location for worker spillover and/or consider enabling dask on ray? This way if the worker fills up, the information is dumped to disk and can be picked up again by the same worker (or another worker if the node or worker fails).

Topic		Replies	Views
Best way to config ray workers Ray Core	6	456	February 26, 2021
How does ray decide where to run a function? Ray Clusters	1	441	December 23, 2021
Ray Worker Max Memory Ray Core	3	521	February 5, 2021
Can Ray support more than 1000 nodes?	1	542	February 2, 2022
[Clusters] [Core] Head node max_workers is not respected Ray Clusters	2	291	June 17, 2021

Ray distributed memory parallelism

Related topics