Severity: Medium — I need to know how to do this to complete my task.
I am trying to process a batch of data on a Ray cluster. I want to find out how many workers are in my cluster so that I can adapt this example:
```python
# They have a preset number of workers; I need to use the number of workers in my cluster.
workers = [Worker.remote(i) for i in range(4)]

ds = ray.data.range(10000)
# -> Dataset(num_blocks=200, num_rows=10000, schema=<class 'int'>)

shards = ds.split(n=4)
# -> [Dataset(num_blocks=13, num_rows=2500, schema=<class 'int'>),
#     Dataset(num_blocks=13, num_rows=2500, schema=<class 'int'>), ...]

ray.get([w.train.remote(s) for w, s in zip(workers, shards)])
```
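One approach I am considering, in case it helps frame the question: `ray.nodes()` returns a list of dicts describing the cluster's nodes, each with an `"Alive"` flag, so the worker count could be derived from the number of live nodes. This is a minimal sketch; `count_alive_nodes` is a hypothetical helper I wrote, and the sample data only mimics the shape of `ray.nodes()` output (in a real script you would pass `ray.nodes()` directly). I am not sure this is the idiomatic way to size the split.

```python
# Hypothetical helper: count the live nodes in the cluster so the
# dataset can be split into one shard per live node.
def count_alive_nodes(nodes):
    return sum(1 for n in nodes if n.get("Alive"))

# Sample data mimicking the shape of ray.nodes() output, trimmed to
# the relevant key. A real script would call ray.nodes() instead.
sample_nodes = [
    {"NodeID": "a", "Alive": True},
    {"NodeID": "b", "Alive": True},
    {"NodeID": "c", "Alive": False},  # e.g. a reclaimed spot instance
]

n_workers = count_alive_nodes(sample_nodes)
print(n_workers)  # -> 2
# Then, roughly: shards = ds.split(n=n_workers)
```

The catch is that this only captures a snapshot at startup, which is exactly why I am asking about elasticity below.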
I was also wondering how to handle elasticity, i.e., how this would work with spot instances.
For example, if a spot instance shuts down, I don't want to lose the shard of the dataset that was assigned to it.