Custom FileBasedDatasource requires dataset_uuid

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I’d like to inherit from ParquetDatasource / FileBasedDatasource in a custom Datasource,
but the FileBasedDatasource.write() API requires dataset_uuid.

I get:
TypeError: FileBasedDatasource.write() missing 1 required positional argument: 'dataset_uuid'

There isn’t a good way to call dataset.write_datasource(MyCustomDatasource(), …),
since I would need to access dataset._uuid, which is intended to be private.

I can work around it by writing two lines:
dataset_uuid = dataset._uuid # hack
dataset.write_datasource(MyCustomDatasource(), …, dataset_uuid=dataset_uuid)

But it’s annoying to require all my users to do this, because it breaks up their chained flows: …).map(…).write_datasource(MyCustomDatasource(), …, )

Any recommendations? I think write_datasource should pass dataset_uuid as a kwarg for all calls, or provide a context/accessor to the Datasource constructor. Maybe I am missing something?
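
In the meantime, one possible stopgap on the Datasource side is to make dataset_uuid optional in my own subclass. The sketch below rests on several assumptions: FileBasedDatasourceStandIn is a hypothetical stand-in with a simplified write() signature (the real one takes more arguments and may differ between Ray versions), and I am assuming that a locally generated UUID is an acceptable substitute for the dataset's own.

```python
import uuid
from typing import Any, List, Optional


class FileBasedDatasourceStandIn:
    """Hypothetical stand-in for FileBasedDatasource; the write() signature is simplified."""

    def write(self, blocks: List[Any], dataset_uuid: str, **write_args: Any) -> List[str]:
        # Pretend to write one file per block, named after the dataset UUID.
        return [f"{dataset_uuid}_{i:06}.parquet" for i, _ in enumerate(blocks)]


class MyCustomDatasource(FileBasedDatasourceStandIn):
    def __init__(self) -> None:
        # Generated once per instance, so repeated write() calls share the same UUID.
        self._fallback_uuid = uuid.uuid4().hex

    def write(
        self,
        blocks: List[Any],
        dataset_uuid: Optional[str] = None,
        **write_args: Any,
    ) -> List[str]:
        # Use the caller's UUID when given; otherwise fall back to our own.
        if dataset_uuid is None:
            dataset_uuid = self._fallback_uuid
        return super().write(blocks, dataset_uuid=dataset_uuid, **write_args)


datasource = MyCustomDatasource()
files = datasource.write(["block-a", "block-b"])
print(files)
```

With this, callers would not need to pass dataset_uuid at all, at the cost of file names no longer being tied to the dataset's real UUID.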