Example Image Writing Code: 'list' object has no attribute '__array_interface__'

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I have the following program that reads and writes images.

from ray.data import read_images

ds = read_images("s3://anonymous@ray-example-data/image-datasets/simple")
ds.write_images("/tmp/simple", column="image", file_format="png")

This is copied verbatim from the Saving Images section of the Ray Data documentation. When I run it, I see the following error:

ray.exceptions.RayTaskError(AttributeError): ray::Write() (pid=18727, ip=127.0.0.1)
    for b_out in map_transformer.apply_transform(iter(blocks), ctx):
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/execution/operators/map_transformer.py", line 253, in __call__
    yield from self._block_fn(input, ctx)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 45, in fn
    block_accessors = [BlockAccessor.for_block(block) for block in blocks]
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 45, in <listcomp>
    block_accessors = [BlockAccessor.for_block(block) for block in blocks]
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/execution/operators/map_transformer.py", line 253, in __call__
    yield from self._block_fn(input, ctx)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 28, in fn
    datasink_or_legacy_datasource.write(it1, ctx)
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 128, in write
    self.write_block(block_accessor, 0, ctx)
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 197, in write_block
    call_with_retry(
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/util.py", line 986, in call_with_retry
    raise e from None
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/util.py", line 973, in call_with_retry
    return f()
           ^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 194, in write_row_to_path
    self.write_row_to_file(row, file)
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/datasource/image_datasink.py", line 21, in write_row_to_file
    image = Image.fromarray(row[self.column])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/PIL/Image.py", line 3304, in fromarray
    arr = obj.__array_interface__
          ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute '__array_interface__'

The problem appears to be that ImageDatasink expects the image data to be a NumPy array, but it is actually a list of lists of integers.
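To illustrate the mismatch, here is a minimal sketch that uses only NumPy. As the traceback shows, Image.fromarray reads the `__array_interface__` attribute, which NumPy arrays provide and plain Python lists do not. The helper name `to_image_array` is hypothetical, and the `uint8` dtype is an assumption about the pixel data. This is a diagnostic aid, not a fix for whatever turns the arrays into lists.

```python
import numpy as np


def to_image_array(value, dtype=np.uint8):
    """Return `value` as a NumPy array (hypothetical helper).

    Assumes 8-bit pixel data; pass a different dtype if that does not hold.
    """
    return np.asarray(value, dtype=dtype)


# A nested list shaped like the data the datasink received in the failing run.
pixels = [[0, 128, 255], [255, 128, 0]]

print(hasattr(pixels, "__array_interface__"))  # False for a plain list
arr = to_image_array(pixels)
print(hasattr(arr, "__array_interface__"))     # True for an ndarray
print(arr.shape, arr.dtype)
```

Checking for `__array_interface__` on a row value is a quick way to tell whether the data reaching the writer is still an array.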

Because I am copying this code verbatim from the documentation, I don’t see how I could have made a mistake. What is going on?

  • ray 2.38.0
  • Python 3.11.10
  • Mac OS X 15.1

This doesn’t appear to be specific to images. When I run the following:

from numpy import array
from ray.data import from_items

ds = from_items([{"data": array([1, 2, 3])}])
print(type(ds.take()[0]["data"]))

It prints <class 'list'>.

Is Ray Data supposed to convert all NumPy arrays to lists? I don’t see any mention of this in the documentation.

Everything works as expected if I downgrade to Ray 2.31.0, so this appears to be a regression.

Never mind, I had an odd version of NumPy installed.

numpy @ file:///private/var/folders/k1/30mswbxs7r1g6zwn8y4fyt500000gp/T/abs_a51i_mbs7m/croot/numpy_and_numpy_base_1708638620867/work/dist/numpy-1.26.4-cp311-cp311-macosx_11_0_arm64.whl#sha256=3d90dd3382cff7becb2384f73058a8e72b81c697e8bb77f1c69a82caca5b0c57

I don’t know how that got on there.

I ran this in a fresh environment with Ray 2.39.0 and NumPy 2.1.3, and everything worked as expected.
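For anyone comparing environments, here is a small sketch that reports which versions of the relevant packages are installed, using only the standard library. The function name `installed_versions` is hypothetical, and including `pillow` in the list is an assumption based on the PIL frames in the traceback. It reports version numbers only. It will not show where a package was installed from, so a locally built wheel like the one above may still need a look at the full package listing.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names as published on PyPI.
for name, ver in installed_versions(["ray", "numpy", "pillow"]).items():
    print(f"{name}: {ver or 'not installed'}")
```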