Hi,
I want to read data from HDFS, but I got the error "Unable to load libhdfs". I am using Modin. What can I do?
Thank you.
cc @Alex Maybe a good Datasets use case?
Maybe try Datasets: Distributed Arrow on Ray — Ray v2.0.0.dev0
You can pass in filesystem= to the read APIs to specify a Hadoop pyarrow filesystem: pyarrow.fs.HadoopFileSystem — Apache Arrow v5.0.0
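A minimal sketch of the filesystem= suggestion above, untested against a real cluster. The host, port, and path values are placeholders, and both function names are made up for illustration. As far as I know from the pyarrow documentation, "Unable to load libhdfs" usually means pyarrow cannot locate the native library, and it consults HADOOP_HOME, ARROW_LIBHDFS_DIR, CLASSPATH, and JAVA_HOME to find it, so the first helper just prints those for inspection.

```python
import os


def hdfs_env_report():
    """Return the environment variables pyarrow consults when loading libhdfs."""
    names = ("HADOOP_HOME", "ARROW_LIBHDFS_DIR", "CLASSPATH", "JAVA_HOME")
    return {name: os.environ.get(name) for name in names}


def read_parquet_from_hdfs(path, host="default", port=0):
    """Read Parquet data from HDFS into a Ray Dataset.

    host, port, and path are placeholders (assumptions); replace them with
    the values for your own cluster. host="default" asks libhdfs to use the
    fs.defaultFS setting from the Hadoop configuration.
    """
    # Imported lazily so hdfs_env_report() works even without ray/pyarrow.
    import ray
    from pyarrow import fs

    hdfs = fs.HadoopFileSystem(host=host, port=port)
    return ray.data.read_parquet(path, filesystem=hdfs)


if __name__ == "__main__":
    # Unset (None) entries are the first thing to check for a libhdfs load error.
    for name, value in hdfs_env_report().items():
        print(name, "=", value)
```

If the variables look right, something like read_parquet_from_hdfs("/path/to/data.parquet") should return a Dataset; the path here is hypothetical.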
Hi @ericl,
Should I install Ray v2.0.0.dev0?
I used Arrow, but I got an error:
AttributeError: module 'pickle' has no attribute 'PickleBuffer'
cc @Alex, can you suggest the best way to get around this issue?
Hey @murat, I don’t have an HDFS cluster handy, but what version of pyarrow do you have (pip freeze | grep pyarrow)?
pip install cloudpickle should fix the pickle error.
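A quick way to gather the details relevant to the PickleBuffer error, as a rough sketch. pickle.PickleBuffer was added to the standard library in Python 3.8, so on an older interpreter the attribute is simply missing, which would be consistent with the AttributeError reported here. The function name pickle_report is made up for illustration.

```python
import pickle
import sys


def pickle_report():
    """Collect the interpreter and pyarrow details relevant to the error."""
    report = {
        "python": "%d.%d" % sys.version_info[:2],
        # False on interpreters older than Python 3.8.
        "has_pickle_buffer": hasattr(pickle, "PickleBuffer"),
    }
    try:
        import pyarrow
        report["pyarrow"] = pyarrow.__version__
    except ImportError:
        # pyarrow is not installed in this environment.
        report["pyarrow"] = None
    return report


print(pickle_report())
```

Posting the printed dictionary along with the full traceback should make it easier to tell whether the Python version or the pyarrow version is the cause.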
I think it would be nice to include a requirements.txt for the ray.data module.
@devin-petersohn will do!