Problem with Ray Datasets on a cluster


First of all, thanks for your time and efforts!
I have been struggling to make use of Ray, and in particular Ray Datasets.

I am using Ray via Domino Data Lab, where I can spin up a cluster (here: 1x head + 3x workers on a "medium" compute tier: 4 cores, 15 GB RAM each). Ray==1.9.2.

I am trying to load a 2.8 GB Parquet file into a Ray Dataset, but however I work with it, it crashes with varying errors. If I don't connect to the cluster, I can run it in local mode just fine. But once connected to the cluster, Ray uses only one worker regardless of the `parallelism` parameter in `read_parquet`. The file is small enough to fit on any of these machines, but for some reason the worker runs out of RAM and gets killed.
I am really confused and not sure what went wrong.

Any suggestions?

Hi @magic-dlg, I think that you’re running into an old issue with Datasets around load balancing read tasks that was fixed a few months ago. Could you try using a Ray nightly wheel to confirm that this is the underlying issue?


Hey Clark,

Many thanks for the suggestion. It helped, and it works now.
Sorry for the long delay: between the holiday break and the corporate environment, things can sometimes take very long.

I tested some vanilla code from the Ray docs, and Dataset and Train work so far. I still have some problems running the default Tune examples, though; I think I need to look into that separately.
