- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Firstly, greetings to everyone. This is my first question here. I am using Ray Data to read a CSV file, apply some map and filter functions, and materialize the results. Then I use `iter_rows()` for post-processing. At this stage, I need the rows to be in the same order as my input data, but I see that Ray changes the order of the data every time I re-run the code. Is there any way to tell either `read_csv` or `materialize` to maintain the input order? I would appreciate any expert advice on this.
The workaround I am considering is to add an extra column to my input data (a range of integers) and sort the dataset on that column before the materialize step. However, I assume this is going to increase processing time, since sorting is expensive.
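
To make the workaround concrete, here is a minimal sketch of the idea in plain Python, not Ray code. The sample CSV, the `row_id` column name, and the helper functions are all hypothetical, and the shuffle only stands in for whatever reordering happens during parallel execution.

```python
import csv
import io
import random

# Hypothetical input standing in for the real CSV file.
RAW_CSV = "name,score\nalice,3\nbob,8\ncarol,5\ndave,9\n"


def load_with_row_id(text):
    """Parse CSV text and tag every row with its original position."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"row_id": i, **row} for i, row in enumerate(reader)]


def process_out_of_order(rows, seed=0):
    """Stand-in for a parallel map/filter step that returns rows in arbitrary order."""
    # Filter: keep scores above 3. Map: double the score.
    kept = [dict(r, score=int(r["score"]) * 2) for r in rows if int(r["score"]) > 3]
    # Simulate the nondeterministic ordering with a shuffle.
    random.Random(seed).shuffle(kept)
    return kept


def restore_order(rows):
    """Sort on the index column to recover the input order."""
    return sorted(rows, key=lambda r: r["row_id"])


rows = load_with_row_id(RAW_CSV)
processed = process_out_of_order(rows)
ordered = restore_order(processed)
names = [r["name"] for r in ordered]
```

In Ray Data, I would expect the last step to correspond to something like `ds.sort("row_id")` before materializing, with the `row_id` column already present in the CSV. The extra sort is the cost I am worried about.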