Create Ray dataset from NumPy array

What is the correct way to convert a NumPy array into a Ray dataset? I tried the following, but it's not working.

import numpy as np
import ray

arr = np.array([[1.9, 1.0, 1.1, 0.9],
                [1.8, 1.0, 1.1, 0.9],
                [1.7, 1.0, 1.2, 0.8],
                [1.6, 1.0, 1.3, 0.9],
                [1.5, 1.0, 1.4, 0.9]])
print(arr)
# Passing the ndarray directly; this is the call that gives the unexpected rows
ds = ray.data.from_numpy(arr)
ds.show(10)

The NumPy array is not stored in the Ray dataset correctly. Each element ends up as its own row:

{'value': array(1.9)}
{'value': array(1.)}
{'value': array(1.1)}
{'value': array(0.9)}
{'value': array(1.8)}
{'value': array(1.)}
{'value': array(1.1)}
{'value': array(0.9)}
{'value': array(1.7)}
{'value': array(1.)}

Thanks in advance

Hi @pratap123, that API actually takes a list of NumPy ndarray futures (object references) rather than a single NumPy ndarray. I think it's treating your ndarray as if it were a list of futures, which is why you're getting the weird result.

If you try

ds = ray.data.from_numpy([ray.put(arr)])

this should work as expected!

I’ll open a PR fixing this so ray.data.from_numpy() is more intuitive.