Does Ray Data support saving nested dictionaries of tensors to Parquet?

How severely does this issue affect your experience of using Ray?

  • Medium: It makes completing my task significantly harder, but I can work around it.

I’m trying to save a list of nested dictionaries of tensors to Parquet, but I’m running into the following error:

pyarrow.lib.ArrowInvalid: ('Could not convert tensor([0., 0., 0., 0., 0.]) with type Tensor: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column nested with type object')

Minimal reproducible example:

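import ray
import torch

# One row whose "nested" column is a dict of torch tensors.
ds_test = ray.data.from_items([{"nested": {"tensor_a": torch.zeros(5), "tensor_b": torch.zeros(5)}}])
# Raises the pyarrow.lib.ArrowInvalid error shown above.
ds_test.write_parquet("local://test_parquet")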

For now I can get around this by flattening my nested dictionary before writing and re-creating the nesting after reading, but I’m wondering if there’s a better way to go about this.
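
For reference, the flattening workaround looks roughly like the sketch below. The helper names `flatten` and `unflatten` and the `/` separator are my own, hypothetical choices and not part of Ray; the sketch assumes all keys are strings that do not contain the separator.

```python
SEP = "/"  # hypothetical separator; assumed not to occur in any original key


def flatten(nested, prefix=""):
    """Turn {"a": {"b": x}} into {"a/b": x} so every column holds a leaf value."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}{SEP}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the key path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat):
    """Inverse of flatten: rebuild the nesting from the separator-joined keys."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(SEP)
        node = nested
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested
```

Each row is passed through `flatten` before writing and through `unflatten` after reading. Note that empty sub-dictionaries are dropped by this sketch, so the round trip is only exact when every branch ends in a value.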

EDIT: The write succeeds if I replace the torch tensors with NumPy arrays. Does anyone know why?
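
The conversion I am using is roughly the following sketch. `tensors_to_numpy` is a hypothetical helper name of my own, not a Ray or PyTorch function. It relies on duck typing (anything with `detach` and `numpy` methods is treated as a tensor), so it is an approximation rather than a strict type check, and it assumes the tensors can be moved to the CPU.

```python
def tensors_to_numpy(value):
    """Recursively replace tensor-like leaves in a nested dict with NumPy arrays."""
    if isinstance(value, dict):
        # Walk nested dictionaries and convert each entry.
        return {key: tensors_to_numpy(item) for key, item in value.items()}
    if hasattr(value, "detach") and hasattr(value, "numpy"):
        # Assumed to behave like torch.Tensor: detach from autograd,
        # move to CPU, then convert to a NumPy array.
        return value.detach().cpu().numpy()
    # Anything else (ints, strings, existing arrays) is returned unchanged.
    return value
```

I apply this to each item before passing the list to `ray.data.from_items`, which keeps the nested structure intact.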