Hi everybody,
I have a challenge. I have grown accustomed to the PyTorch dataset framework, where I create a dataset by writing a custom PyTorch Dataset object. My dataset is generated with NVIDIA Omniverse and lives in a directory structure: a folder of RGB images, a folder of segmentation images, a folder of JSON files, etc. When I build the dataset the PyTorch way, I initialize it by creating a table of paths, so that when `__getitem__` is called I look up the paths for that item and load each piece of data accordingly. However, I was swayed when I learned about the parallelism, sharding, and storage abstractions (local filesystem or S3, for example) that Ray Datasets offer. The problem is that I don't see a clear way to transform my directory-structured dataset into a Ray dataset. If anybody has suggestions on how to go about this, I would deeply appreciate it!
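
For reference, here is a minimal sketch of the path-table pattern I described above. The folder names (`rgb`, `segmentation`, `json`), the `.png` extension, and the assumption that files for one sample share the same stem are placeholders for illustration, not my exact layout. To keep the sketch dependency-free it does not subclass `torch.utils.data.Dataset` and returns raw bytes instead of decoded images; a real version would do both.

```python
import json
from pathlib import Path


class PathTableDataset:
    """Map-style dataset holding one row of file paths per sample.

    Assumed (hypothetical) layout:
        root/rgb/<id>.png
        root/segmentation/<id>.png
        root/json/<id>.json
    """

    def __init__(self, root):
        root = Path(root)
        # Build the table of paths once, matching files by their shared stem.
        self.rows = []
        for rgb_path in sorted((root / "rgb").glob("*.png")):
            stem = rgb_path.stem
            seg_path = root / "segmentation" / f"{stem}.png"
            meta_path = root / "json" / f"{stem}.json"
            # Keep only samples for which every piece is present on disk.
            if seg_path.exists() and meta_path.exists():
                self.rows.append(
                    {"rgb": rgb_path, "seg": seg_path, "meta": meta_path}
                )

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        # Look up the paths for this item and load each piece.
        row = self.rows[index]
        return {
            "rgb": row["rgb"].read_bytes(),    # raw bytes; decoding omitted
            "seg": row["seg"].read_bytes(),    # raw bytes; decoding omitted
            "meta": json.loads(row["meta"].read_text()),
        }
```

The part I would like to carry over is `self.rows`: it is just a list of per-sample path records, and the loading in `__getitem__` is a per-record function.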