How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I have the following program that reads and writes images.
from ray.data import read_images
ds = read_images("s3://anonymous@ray-example-data/image-datasets/simple")
ds.write_images("/tmp/simple", column="image", file_format="png")
This is copied verbatim from the Saving Images section of the Ray Data documentation. When I run it, I see the following error:
ray.exceptions.RayTaskError(AttributeError): ray::Write() (pid=18727, ip=127.0.0.1)
for b_out in map_transformer.apply_transform(iter(blocks), ctx):
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/execution/operators/map_transformer.py", line 253, in __call__
yield from self._block_fn(input, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 45, in fn
block_accessors = [BlockAccessor.for_block(block) for block in blocks]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 45, in <listcomp>
block_accessors = [BlockAccessor.for_block(block) for block in blocks]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/execution/operators/map_transformer.py", line 253, in __call__
yield from self._block_fn(input, ctx)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/planner/plan_write_op.py", line 28, in fn
datasink_or_legacy_datasource.write(it1, ctx)
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 128, in write
self.write_block(block_accessor, 0, ctx)
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 197, in write_block
call_with_retry(
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/util.py", line 986, in call_with_retry
raise e from None
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/util.py", line 973, in call_with_retry
return f()
^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/datasource/file_datasink.py", line 194, in write_row_to_path
self.write_row_to_file(row, file)
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/ray/data/_internal/datasource/image_datasink.py", line 21, in write_row_to_file
image = Image.fromarray(row[self.column])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/anaconda3/envs/pipeline/lib/python3.11/site-packages/PIL/Image.py", line 3304, in fromarray
arr = obj.__array_interface__
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute '__array_interface__'
The problem appears to be that the ImageDatasink expects the image data to be a numpy array, but what it actually receives is a list of lists of integers.
Because I am copying this code verbatim from the documentation, I don’t see how I could have made a mistake. What is going on?
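To check my reading of the traceback outside of Ray, here is a minimal sketch of the difference between the two types. The 2x2 `row_image` value is a made-up stand-in for one row's "image" column, not data from the actual dataset, and the check only mirrors the attribute lookup shown in the last frame of the traceback (`obj.__array_interface__`).

```python
import numpy as np

# Hypothetical stand-in for one row's "image" value, in the nested-list form
# the traceback suggests the datasink received (real images are much larger).
row_image = [[0, 255], [255, 0]]

# The failing frame reads obj.__array_interface__, which plain lists lack.
list_has_interface = hasattr(row_image, "__array_interface__")

# Converting to a NumPy array yields an object that exposes the attribute.
as_array = np.asarray(row_image, dtype=np.uint8)
array_has_interface = hasattr(as_array, "__array_interface__")

print(list_has_interface, array_has_interface, as_array.shape)
```

This only shows why a list fails where an array would not; it does not explain why the rows reach the datasink as lists in the first place.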
- ray 2.38.0
- Python 3.11.10
- Mac OS X 15.1