1. Severity of the issue: (select one)
Low: Annoying but doesn’t hinder my work.
2. Environment:
- Ray version: 2.52.1
- Python version: 3.12/3.13
- OS: MacOS/Linux
3. What happened vs. what you expected:
Here’s a minimal example of a problem I encountered in a more complex project.
I’m generating a series of tuples and want to sort them:
import ray
import ray.data

ray.init()
ds = ray.data.from_items(
    [(x, y) for y in range(3) for x in range(3)]
)
ds.sort("item").show()
This minimal example raises an exception (traceback below), and I haven’t found a way to sort the column directly. I didn’t expect this behaviour, since tuples can be sorted in native Python.
I’m not sure whether this failure is intended.
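For comparison, here is a plain-Python sketch of the ordering I would expect. It uses only the built-in `sorted()` on the same nine tuples and involves no Ray at all:

```python
# Same items as in the Ray example above.
items = [(x, y) for y in range(3) for x in range(3)]

# Python compares tuples lexicographically: first element, then second.
expected = sorted(items)
print(expected)
```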
To sort this dataset, I added one column per tuple element, sorted on those columns, and removed them again afterwards. Alternatively, I could generate the data in separate columns from the start, but I want to consume a tuple in a later step.
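The workaround looks roughly like the sketch below. The helper names `split_item`, `drop_keys` and `sort_by_tuple`, as well as the temporary columns `k0` and `k1`, are made up for this post. I'm also assuming that `row["item"]` can be indexed like a sequence inside `map`, and that `sort` accepts a list of column names:

```python
def split_item(row):
    """Copy the two tuple elements into temporary scalar columns (k0, k1)."""
    first, second = row["item"][0], row["item"][1]
    return {**row, "k0": int(first), "k1": int(second)}


def drop_keys(row):
    """Remove the temporary sort columns again, keeping everything else."""
    return {name: value for name, value in row.items() if name not in ("k0", "k1")}


def sort_by_tuple(ds):
    """Sort a dataset with an "item" column via the scalar key columns.

    `ds` is expected to be a Ray Dataset; nothing here is specific to my data.
    """
    return ds.map(split_item).sort(["k0", "k1"]).map(drop_keys)
```

With the dataset from the example above, this would be called as `sort_by_tuple(ds).show()`. It works for me, but it feels clumsy for something that is a one-liner in plain Python.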
Do you have any other suggestions?
If this behaviour is intended, it would be great to document the limitation in the ray.data documentation, or to explain the sort logic so that the behaviour becomes intuitive.
Traceback (most recent call last):
  File "/Users/achim/Code/ray-sort/./ray-sort-list.py", line 12, in <module>
    ds.sort("item").show()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/dataset.py", line 3570, in show
    for row in self.take(limit):
               ~~~~~~~~~^^^^^^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/dataset.py", line 3492, in take
    for row in limited_ds.iter_rows():
               ~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/iterator.py", line 246, in _wrapped_iterator
    for batch in batch_iterable:
                 ^^^^^^^^^^^^^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/iterator.py", line 190, in _create_iterator
    ) = self._to_ref_bundle_iterator()
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/_internal/iterator/iterator_impl.py", line 27, in _to_ref_bundle_iterator
    ref_bundles_iterator, stats = self._base_dataset._execute_to_iterator()
                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/dataset.py", line 6586, in _execute_to_iterator
    bundle_iter, stats, executor = self._plan.execute_to_iterator()
                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/Caskroom/miniforge/base/envs/ray-only/lib/python3.13/site-packages/ray/data/exceptions.py", line 89, in handle_trace
    raise e.with_traceback(None) from SystemException()
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()