Data for building a global time series forecasting model with deep learning

I enjoyed yesterday’s webinar on time series forecasting with Ray.

I have one question that I didn’t have time to ask, about building a global deep learning model.

We have 10K targets to forecast, but the data granularity is monthly. We have 10 years of historical data, so each time series will have 120 rows. Is that still enough data to train a global deep learning model on a panel data set? Our success criterion is whether the global model generally gives us more accurate results than statistical, univariate models.

Or should we consider moving to a finer data granularity, daily at least? I’d like to try the monthly data set first, but I’d like to know whether anyone has tried this for monthly forecasting and seen the model still perform well.
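
One way to make that success criterion concrete is to score both models per series with a scale-free error and count how often the global model wins. The sketch below is only an illustration: the choice of MASE with a 12-month seasonal-naive scale and the helper names `mase` and `share_of_series_won` are assumptions, not something from the webinar.

```python
def mase(actual, forecast, train, season=12):
    """Mean absolute scaled error, scaled by the in-sample seasonal-naive error.

    Assumes `train` has more than `season` points and is not perfectly
    seasonal (otherwise the scale is zero).
    """
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


def share_of_series_won(global_errors, univariate_errors):
    """Fraction of series where the global model has strictly lower error."""
    wins = sum(g < u for g, u in zip(global_errors, univariate_errors))
    return wins / len(global_errors)


# Toy usage: two years of monthly history, two held-out months.
history = list(range(24))
print(mase([24, 25], [24, 31], history))
print(share_of_series_won([0.5, 1.0, 0.8, 1.2], [0.6, 0.9, 1.0, 1.2]))
```

Looking at the share of series won alongside the average error helps show whether the global model is better "generally" or only on a few high-volume items.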

reference article:

Hi Yeonjoo,

Thank you, glad you enjoyed yesterday’s webinar! Hmm, the problem I’ve seen in the past with monthly sales data is that, even with 10 years of history, the majority of items will not have been sold continuously for all 10 years. I don’t know if that is your case?

One way to run a small experiment: subset your data to just those items with steady, long sales histories. Another tip, especially with sales data: make sure you encode any missing data as “missing”, not 0! The easiest way is to leave the missing timestamps out of the data entirely. In the github.com/christy code that I demo’d, that is covered by the allow_missing_timesteps=True part.

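import pytorch_forecasting as ptf  # assumed meaning of the "ptf" alias

# convert the pandas DataFrame into a PyTorch Forecasting dataset
training_data = ptf.data.TimeSeriesDataSet(
    the_df[lambda x: x.time_idx <= training_cutoff],
    allow_missing_timesteps=True,
    # ... remaining arguments (time_idx, target, group_ids, ...) omitted here
)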
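
The two tips above (keep only long, steady histories, and leave missing months out rather than writing 0) could be sketched in pandas roughly as follows. The column names `item_id`, `month`, and `sales`, the helper name, and the `min_months` threshold are all hypothetical; adjust them to your own data.

```python
import pandas as pd


def prepare_monthly_panel(df, min_months=96):
    """Drop missing rows, keep long-history items, add an integer month index."""
    # Leave missing observations out entirely instead of encoding them as 0.
    out = df.dropna(subset=["sales"]).copy()
    # Keep only items observed in at least `min_months` distinct months.
    months_observed = out.groupby("item_id")["month"].transform("nunique")
    out = out[months_observed >= min_months].copy()
    # Integer month index relative to the earliest remaining month.
    # Skipped months show up as gaps in time_idx, which is what
    # allow_missing_timesteps=True is meant to tolerate.
    first = out["month"].min()
    out["time_idx"] = (out["month"].dt.year - first.year) * 12 + (
        out["month"].dt.month - first.month
    )
    return out.sort_values(["item_id", "time_idx"]).reset_index(drop=True)


# Toy usage: item A has one missing month, item B has too little history.
months = pd.date_range("2020-01-01", periods=6, freq="MS")
toy = pd.DataFrame({
    "item_id": ["A"] * 6 + ["B"] * 2,
    "month": list(months) + list(months[:2]),
    "sales": [5.0, 6.0, None, 7.0, 8.0, 9.0, 1.0, 2.0],
})
print(prepare_monthly_panel(toy, min_months=4))
```

The resulting frame has the `time_idx` column that the filter on `training_cutoff` above relies on.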

I’d suggest trying this with your monthly data first. If you don’t see good results with a DL global model, try weekly-aggregated data next. The finer your data granularity gets, the more likely the data will be too sparse to be useful. From your question, I’m guessing you’re unsure whether you have enough data for a daily model?

If you have any questions along the way, please feel free to contact me. Not sure if you can see my contact info here? If not, Charles can give you my direct info.

Thanks and good luck!
Christy

Thank you, Christy. Yes, we’ll test the monthly data first and try the tips you mentioned. I agree daily data will perform better in most cases. Most of our existing data, including the feature data sets, was prepared at monthly frequency because our users request monthly forecasts. We can test getting the data in weekly format, aggregating the forecast results to monthly, and seeing whether that works better than the monthly forecast.

Yes, in many cases our data is sparse; dealing with it well is a key challenge.

I’ll reach out to you once we start building global models and have questions.
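
Rolling weekly forecasts up to monthly totals might look like the sketch below. It assumes a frame with hypothetical columns `item_id`, `week_end`, and `forecast`, and it assigns each week to the month of its week-ending date, which is only one possible convention.

```python
import pandas as pd


def weekly_to_monthly(weekly, value_col="forecast"):
    """Sum weekly values into the calendar month of each week's end date."""
    out = weekly.copy()
    # Map each week-ending date to the first day of its month.
    out["month"] = out["week_end"].dt.to_period("M").dt.to_timestamp()
    return out.groupby(["item_id", "month"], as_index=False)[value_col].sum()


# Toy usage: four weeks ending in January, one ending in February.
weekly = pd.DataFrame({
    "item_id": ["A"] * 5,
    "week_end": pd.to_datetime(
        ["2024-01-07", "2024-01-14", "2024-01-21", "2024-01-28", "2024-02-04"]
    ),
    "forecast": [1.0, 2.0, 3.0, 4.0, 5.0],
})
print(weekly_to_monthly(weekly))
```

Weeks that straddle a month boundary are assigned whole to one month here; splitting them in proportion to the days in each month is an alternative if that bias matters for the comparison.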