Pivot rolling window on time series data

mentics · June 29, 2023, 8:36pm

I have a dataset with two columns: a timestamp and a value. The values in the timestamp column are unique.

I want to transform it to a dataset with each row:

timestamp, value, lag(value, 1), lag(value, 2), lag(value, 3), ...

where lag(value, n) means the value column nth timestamps previous.
So (first column is actually a timestamp type, but… easier to write just numbers):
1,10
2,34
4,22
7,66
13,10
17,29

So, if window size was 4, then what I would like is:
7, 66, 22, 34, 10
13, 10, 66, 22, 34
17, 29, 10, 66, 22

It’s a large dataset, so I’d like to do it in a distributed (possibly lazy) way. How might that best be done?

Topic		Replies	Views
Map Dataset with Rolling Window Ray Data	1	291	April 22, 2024
[Ray Dataset] Shifting data/Lag features Ray Data	4	523	August 17, 2022
[Dataset] function add_column inserts repeats of sub-column instead of whole column Ray Data	2	426	November 30, 2022
[Datasets] Create custom dataset by grouping/merging existing blocks Ray Data	9	1296	November 30, 2022
Ray Data Map batches performance optimization Ray Data	2	244	August 1, 2024

Pivot rolling window on time series data

Related topics