Converting IterableDataset to Ray iterator

We have an IterableDataset object in our application API that subclasses torch.utils.data.IterableDataset. I tried converting it to a Ray iterator so that the data can be read in parallel, but it fails with the error below, and the link given in the error message doesn’t exist either.

ray.util.iter.from_iterators([data_loader.dataset])

TypeError: Could not serialize the argument <IterableDataset object at 0x7f67c405f070> for a task or actor ray.util.iter.ParallelIteratorWorker.__init__. Check https://docs.ray.io/en/master/serialization.html#troubleshooting for more information.

Any input on how this can be achieved?

Thanks

@Clark_Zinzow Can you please help with the above?

If the IterableDataset is not serializable, then unfortunately it cannot be used with from_iterators(). Also, parallel iterators and MLDataset are not under active development, so I would encourage you to try Ray Datasets instead!
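
To illustrate what "not serializable" usually means here, below is a minimal sketch using only the standard library. It is not your dataset and not Ray's API: EagerDataset, LazyDataset, and is_picklable are hypothetical names, a threading.Lock stands in for whatever unpicklable resource the real dataset might hold (open file, DB connection, etc.), and stdlib pickle is only an approximation of Ray's cloudpickle-based serializer. The idea is that an object which stores only plain configuration and creates its resources inside __iter__ can typically be shipped to a worker, while one that creates them in __init__ cannot.

```python
import pickle
import threading


class EagerDataset:
    """Hypothetical stand-in for a dataset that holds an unpicklable resource."""

    def __init__(self, n):
        self.n = n
        # Locks, open files, and connections generally cannot be pickled.
        self.resource = threading.Lock()

    def __iter__(self):
        return iter(range(self.n))


class LazyDataset:
    """Same data, but only plain configuration is stored on the object."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # The resource is created where iteration happens, so it is never serialized.
        resource = threading.Lock()
        for i in range(self.n):
            with resource:
                value = i
            yield value


def is_picklable(obj):
    """Return True if stdlib pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    print(is_picklable(EagerDataset(3)))
    print(is_picklable(LazyDataset(3)))
```

If the real dataset can be restructured along the lines of LazyDataset, the serialization error may go away. To find out which attribute is the culprit, ray.util.inspect_serializability may also be worth trying, assuming your Ray version provides it.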