Using Ray Plasma Effectively

Hi,

We are using the RaySGD Dataset for our training; a parallel iterator reads the data from our persistent store.

During every training epoch, the iterator fetches the data from the persistent store again, while the Ray Plasma store is filled only to some extent. Instead of reading from the persistent store so often, we would like the iterator to read most of the data from the Plasma store. How can we achieve that effectively?

cc @rliaw Can you answer this question?

Maybe you could try using the Dataset API? The MLDataset API cc @Kai_Huang could also be useful here.

https://docs.ray.io/en/master/raysgd/raysgd_dataset.html

@rliaw We are already using the RaySGD Dataset API.

Thanks @rliaw, I'll look at it :)