Missing data imputation in multivariable time series data

Authors

Daneshpour, Negin Shahid Rajaee Teacher Training University

Mirabolghasemi, Seyedeh Fatemeh Shahid Rajaee Teacher Training University

Abstract:

Multivariate time series data arise in a variety of fields, such as bioinformatics, biology, genetics, astronomy, geography, and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is challenging and needs to be handled carefully before a model is trained on, or used to predict, the series. Much research has addressed time series missing data imputation with a range of techniques, but these are usually simple analytic methods, models tailored to specific applications, or methods for univariate time series. In this paper, a hybrid approach to imputing missing data is proposed. An improved version of inverse distance weighting (IDW) interpolation is used to impute the missing data. The IDW interpolation method has two major limitations: 1) finding the points closest to the missing data and 2) choosing the optimal effect power for the neighbors of the missing data. Clustering is used to address the first limitation: it finds the points closest to the missing data, that is, the data most similar to it. In this paper, the k-means clustering method is used to find similar data; this method has been more accurate than other clustering methods on multivariate time series. To address the second limitation, an evolutionary algorithm is used to find the optimal effect power of each data point. Among evolutionary algorithms, the cuckoo search algorithm is used because of its high convergence speed, its much lower probability of being trapped in local optima, and its ability to quickly solve high-dimensional optimization problems for multivariate time series. To evaluate the performance of the proposed method, the RMSE, MAE, MSE, and MAPE criteria are used.
Experiments are run on four UCI datasets with different percentages of missingness. In general, the proposed algorithm performs better than the three comparison methods, with an average RMSE of 0.05, MAE of 0.04, MSE of 0.003, and MAPE of 5. The correlation between the actual data and the estimated values in the proposed method is about 99%.
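
The IDW step and the error criteria named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses a single fixed power for all neighbors and treats every complete row as a neighbor, whereas the paper selects neighbors with k-means and tunes the powers with cuckoo search. The function names (idw_impute, rmse, mae, mape), the Euclidean distance, and the toy data are assumptions made for illustration only.

```python
import math

def idw_impute(target, neighbors, missing_idx, power=2.0):
    """Estimate target[missing_idx] from complete neighbor rows using IDW.

    Assumption: distance is Euclidean over the observed variables, and one
    fixed power is shared by all neighbors.
    """
    observed = [j for j in range(len(target)) if j != missing_idx]
    weighted_sum = 0.0
    weight_total = 0.0
    for row in neighbors:
        dist = math.sqrt(sum((target[j] - row[j]) ** 2 for j in observed))
        if dist == 0.0:
            # Neighbor matches the target on every observed variable:
            # use its value directly and avoid dividing by zero.
            return row[missing_idx]
        weight = 1.0 / dist ** power  # closer neighbors get larger weights
        weighted_sum += weight * row[missing_idx]
        weight_total += weight
    return weighted_sum / weight_total

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy data: three complete rows and one row missing its last value.
neighbors = [[1.0, 2.0, 10.0], [2.0, 2.0, 20.0], [4.0, 2.0, 40.0]]
target = [3.0, 2.0, None]
estimate = idw_impute(target, neighbors, missing_idx=2, power=2.0)
```

A larger power concentrates the weight on the nearest neighbors, while a smaller one moves the estimate toward the plain average of the neighbors, which is why the choice of power affects imputation accuracy.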

similar resources

Multiple Imputation for Missing Data

Multiple imputation provides a useful strategy for dealing with data sets with missing values. Instead of filling in a single value for each missing value, Rubin’s (1987) multiple imputation procedure replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute. These multiply imputed data sets are then analyzed by using standard proc...

Online Time Series Prediction with Missing Data

We consider the problem of time series prediction in the presence of missing data. We cast the problem as an online learning problem in which the goal of the learner is to minimize prediction error. We then devise an efficient algorithm for the problem, which is based on an autoregressive model and does not assume any structure on the missing data nor on the mechanism that generates the time seri...

KNN-DTW Based Missing Value Imputation for Microarray Time Series Data

Microarray technology provides an opportunity for scientists to analyze thousands of gene expression profiles simultaneously. However, microarray gene expression data often contain multiple missing expression values due to many reasons. Effective methods for missing value imputation in gene expression data are needed since many algorithms for gene analysis require a complete matrix of gene arra...

Single missing data imputation in PLS-SEM

An important source of bias in structural equation modeling (SEM) employing the partial least squares method (PLS) is missing data. Deletion methods, such as listwise and pairwise deletion, have traditionally been used to deal with missing data. These methods are perceived as leading to selective loss of data and significant related biases. Missing data imputation methods, on the other hand, do...

Missing Data Imputation for Time-Frequency Representations of Audio Signals

With the recent attention towards audio processing in the time-frequency domain we increasingly encounter the problem of missing data within that representation. In this paper we present an approach that allows us to recover missing values in the time-frequency domain of audio signals. The presented approach is able to deal with real-world polyphonic signals by operating seamlessly even in the ...

Missing Data Imputation in Cardiac Data Set (survival Prognosis)

Treating missing values is a major task in data preprocessing. Missing data are a potential source of bias when analyzing clinical trials. In this paper we analyze the performance of different data imputation methods in a task where the aim is to predict the probability of survival of cardiac patients. We compare methods of handling missing data in a cardiac dataset. Mean Imputa...

Journal title

Signal and Data Processing (پردازش علائم و داده ها)

volume 19 issue 2

pages 39-60

publication date 2022-09
