I am using the Ray AIR TorchTrainer on my Ray cluster. The dataset is created from a CSV file and then consumed by multiple workers. The issue is that when I call TorchTrainer, the whole file appears to be read into memory by a single node. Is there any way to avoid loading the complete file into memory?
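For context, here is a minimal sketch of the pattern I mean, using `ray.data.read_csv` and passing the dataset to `TorchTrainer` (the path, batch size, and worker count are placeholders, not my actual config):

```python
import ray
from ray.air import session
from ray.air.config import ScalingConfig
from ray.train.torch import TorchTrainer

# Read the CSV as a Ray Dataset (path is illustrative).
ds = ray.data.read_csv("s3://my-bucket/data.csv")

def train_loop_per_worker(config):
    # Each worker is supposed to pull only its shard of the dataset.
    shard = session.get_dataset_shard("train")
    for epoch in range(config["num_epochs"]):
        for batch in shard.iter_torch_batches(batch_size=config["batch_size"]):
            ...  # forward/backward pass here

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"num_epochs": 1, "batch_size": 256},
    scaling_config=ScalingConfig(num_workers=4),
    datasets={"train": ds},
)
result = trainer.fit()
```

Even with this setup, it looks like the full CSV ends up materialized on one node before the workers start consuming it.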