Handling Gaps in Time Series. Missingness analysis and evaluation… | by Erich Silva

Missingness analysis and evaluation methods for short and long sequences imputation

Photo by Willian Justen de Vasconcellos on Unsplash

Time is the most well-defined continuum in physics and, hence, in nature. It should come as no surprise, then, that continuity matters in time series datasets, which are chronological sequences of observations.

This concept alone is the motivation behind this article. Real-world datasets are susceptible to missing values for various reasons, such as faulty sensors, failures in data ingestion, or simply the absence of information during a given period. That, however, doesn’t change the underlying nature of the data-generating process of your features.

Understanding what caused those interruptions, and then analyzing and handling them in a time series dataset, is therefore paramount to any subsequent task.

The Goal of this Article

After a comprehensive exploratory analysis of your time series, you might find that missing values are present to a considerable extent. By understanding the nature of your data, you should be able to tell a gap that represents missingness apart from a gap that reflects an actual interruption in the process; the latter characterizes an intermittent series.
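As a starting point for that kind of analysis, the sketch below shows one way to locate the gaps in a series and measure how long each one is. It is a minimal illustration, not a method taken from this article: it assumes the data lives in a pandas Series with a regular daily DatetimeIndex and that missing observations are encoded as NaN. The helper name summarize_gaps and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def summarize_gaps(series: pd.Series) -> pd.DataFrame:
    """Return one row per run of consecutive missing values (hypothetical helper)."""
    is_missing = series.isna()
    # A new run starts whenever the missing/observed state flips.
    run_id = is_missing.ne(is_missing.shift(fill_value=False)).cumsum()
    rows = []
    # Keep only the missing positions and group them by the run they belong to.
    for _, run in series[is_missing].groupby(run_id[is_missing]):
        rows.append(
            {"start": run.index[0], "end": run.index[-1], "length": len(run)}
        )
    return pd.DataFrame(rows, columns=["start", "end", "length"])


# Toy daily series (made-up values) with one single-point gap and one longer gap.
index = pd.date_range("2024-01-01", periods=12, freq="D")
values = np.arange(12, dtype=float)
values[[2, 5, 6, 7, 8]] = np.nan
series = pd.Series(values, index=index)

missing_fraction = series.isna().mean()  # share of timestamps without a value
gaps = summarize_gaps(series)

print(f"Missing fraction: {missing_fraction:.2%}")
print(gaps)
```

A table like this only describes where the gaps are and how long they last. Whether a given gap is missingness or a genuine interruption of the process remains a judgment that depends on what you know about how the data was generated.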

This article will focus on the first scenario — analysis of missing values and methods to evaluate imputation results. While the actual techniques to perform imputation are many [1][2], I will elaborate on the…