How reliable are time series forecasts?



How cross-validation, visualisation, and statistical hypothesis testing combine to reveal the optimal forecasting horizon


Photo by Nigel Tadyanehondo on Unsplash

Imagine you have a crystal ball — a mysterious family heirloom, handed down through generations. It shows its age, its clarity and lustre long gone, with some chips scattered across the surface.

Despite its hazy provenance, the things you see in it still seem to come true in one way or another, at least in the short term. It often shows you events far into the future, but how much can you really trust it?

The crystal ball I’m talking about here is, of course, our time series model, which we’ve built following the same approach underlying Meta’s Prophet suite. I’ve cheekily referred to my implementation as the False Prophet, but it appears to be anything but, producing what look to be fairly accurate forecasts (and I’ve got the cross-validation results to prove it).

Yet it is only a model, and apart from usually being wrong, models also tend to struggle a bit at the extremes and edges. In this context, the extremes are forecasts far into the future.

In what follows, we’ll build a time series model to predict UK road traffic accidents and explore just how far we can push it before it starts to break down. We’ll use cross-validation to maximise the utility of the data, building many models and diving into the out-of-fold residuals of each to see how forecasting accuracy holds up across the forecasting horizon. We’ll look at how we can visually assess how well a model structure can forecast, before taking a more statistical approach to determining the maximum “suitable” forecasting window. As a cherry on top, we’ll look at how decisions around data usage can affect the forecasting accuracy of a model.

That’s a lot, so let’s get cracking.
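Before diving in, here is a minimal sketch of the cross-validation idea described above: fit on an expanding window, forecast a fixed horizon past each cutoff, and collect the out-of-fold residuals by horizon step. Everything in it is an assumption made for illustration. The data is synthetic, the function name `rolling_origin_residuals` and its parameters are hypothetical, and a seasonal-naive forecast stands in for the real model.

```python
import numpy as np


def rolling_origin_residuals(y, initial, horizon, step, season=12):
    """Out-of-fold residuals (actual - forecast) per fold and horizon step.

    Hypothetical helper: the 'model' is a seasonal-naive forecast that repeats
    the last observed season, a stand-in for the forecaster being evaluated.
    Assumes initial >= season.
    """
    y = np.asarray(y, dtype=float)
    rows = []
    # Each cutoff defines one fold: train on y[:cutoff], test on the next `horizon` points.
    for cutoff in range(initial, len(y) - horizon + 1, step):
        last_season = y[cutoff - season:cutoff]
        forecast = np.array([last_season[h % season] for h in range(horizon)])
        actual = y[cutoff:cutoff + horizon]
        rows.append(actual - forecast)
    return np.vstack(rows)  # shape: (n_folds, horizon)


# Synthetic monthly series (an assumption): trend + yearly seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(120)
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, size=t.size)

residuals = rolling_origin_residuals(y, initial=48, horizon=24, step=6)

# Mean absolute error at each step of the horizon, averaged over folds.
mae_by_horizon = np.abs(residuals).mean(axis=0)
print(np.round(mae_by_horizon, 2))
```

Each row of `residuals` is one fold and each column is one step into the future, so summarising down the columns shows how the error behaves as the horizon grows. Because the stand-in forecaster ignores the trend, its error here is larger in the second year of the horizon than in the first.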

As always, we’ll be using real-world data. We’ll continue with UK road traffic accidents¹, summarised into a time series. It’s worth reminding ourselves of some of the characteristics of this time series:
