I am trying to process a batch of data on a Ray cluster. I want to get the list of workers in my cluster so that I can emulate this example:
import ray

# They use a preset number of workers; I need to use however many workers my cluster has.
workers = [Worker.remote(i) for i in range(4)]
ds = ray.data.range(10000)
# -> Dataset(num_blocks=200, num_rows=10000, schema=<class 'int'>)
shards = ds.split(n=4)
# -> [Dataset(num_blocks=13, num_rows=2500, schema=<class 'int'>),
#     Dataset(num_blocks=13, num_rows=2500, schema=<class 'int'>), ...]
ray.get([w.train.remote(s) for w, s in zip(workers, shards)])
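For context, here is what I have been considering so far: deriving the worker count from `ray.nodes()`, which returns a list of dicts with an `"Alive"` flag. The `count_alive_nodes` helper below is my own (hypothetical) wrapper, and the "one worker actor per alive node" mapping is an assumption on my part, not something from the docs:

```python
def count_alive_nodes(nodes):
    # `nodes` has the shape returned by ray.nodes():
    # a list of dicts, each with an "Alive" boolean.
    return sum(1 for n in nodes if n.get("Alive"))

# Intended usage on a running cluster (assumption: one Worker actor per alive node):
# import ray
# ray.init(address="auto")
# n = count_alive_nodes(ray.nodes())
# workers = [Worker.remote(i) for i in range(n)]
# shards = ds.split(n=n)
```

I am not sure whether counting alive nodes is the right granularity (vs. counting CPUs via `ray.cluster_resources()`), which is part of my question.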
I was also wondering how to handle elasticity, i.e., how spot instances would work with this. For example, if a spot instance shuts down, I don't want the shard of the dataset assigned to it to be dropped.