How severely does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
I’d like to inherit ParquetDatasource / FileBasedDatasource in a custom Datasource.
But the FileBasedDatasource write() API requires dataset_uuid.
I get:
TypeError: FileBasedDatasource.write() missing 1 required positional argument: 'dataset_uuid'
There isn’t a good way to call dataset.write_datasource(MyCustomDatasource(), …), since I need to access dataset._uuid, which is intended to be private.
I can work around it by writing two lines:
dataset_uuid = dataset._uuid # hack
dataset.write_datasource(MyCustomDatasource(), …, dataset_uuid=dataset_uuid)
But it’s annoying to require all my users to do this in their flows:
dataset.map(…).map(…).write_datasource(MyCustomDatasource(), …)
Any recommendations? I think write_datasource should pass dataset_uuid as a kwarg for all calls, or provide a context/accessor to the Datasource constructor. Maybe I am missing something?
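In the meantime, here is a minimal sketch of how I could hide the hack in one place. The helper name write_with_uuid is hypothetical (it is not part of Ray), and it assumes only what is described above: that the dataset exposes the private _uuid attribute and that write_datasource forwards a dataset_uuid keyword to the datasource.

```python
from typing import Any


def write_with_uuid(dataset: Any, datasource: Any, **write_args: Any) -> Any:
    """Hypothetical helper (not a Ray API): forward the dataset's UUID.

    Relies on the private attribute dataset._uuid, so it may break
    between Ray versions.
    """
    # An explicit dataset_uuid passed by the caller wins over the private one.
    write_args.setdefault("dataset_uuid", dataset._uuid)
    return dataset.write_datasource(datasource, **write_args)
```

Users would then call write_with_uuid(dataset.map(…).map(…), MyCustomDatasource(), …). That still breaks the fluent chain, so it is a stopgap rather than a real fix.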