A data-driven missing value imputation approach for longitudinal datasets

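Abstract

Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicabilities and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that can be achieved through the proposed data-driven approach.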
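The feature-wise selection idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three candidate methods (`mean`, `median`, `previous_wave`), the `prev` mapping from a feature to the same variable in an earlier wave, the hidden fraction, and the use of mean absolute error are all hypothetical choices made for the example. The abstract only says that five methods are ranked per feature by their estimation error on known values.

```python
import numpy as np

def impute_mean(X, j, prev=None):
    col = X[:, j].copy()
    col[np.isnan(col)] = np.nanmean(col)
    return col

def impute_median(X, j, prev=None):
    col = X[:, j].copy()
    col[np.isnan(col)] = np.nanmedian(col)
    return col

def impute_previous_wave(X, j, prev=None):
    # Hypothetical longitudinal method: copy the value of the same
    # variable from an earlier wave (column prev[j]) when it is known,
    # and fall back to the feature mean otherwise.
    col = X[:, j].copy()
    if prev is not None and j in prev:
        earlier = X[:, prev[j]]
        usable = np.isnan(col) & ~np.isnan(earlier)
        col[usable] = earlier[usable]
    col[np.isnan(col)] = np.nanmean(X[:, j])
    return col

# Candidate methods (assumed for illustration, not the paper's five).
METHODS = {
    "mean": impute_mean,
    "median": impute_median,
    "previous_wave": impute_previous_wave,
}

def select_method_per_feature(X, prev=None, hide_frac=0.2, seed=0):
    """For each feature, hide some known values, estimate them with every
    candidate method, and keep the method with the lowest absolute error."""
    rng = np.random.default_rng(seed)
    choice = {}
    for j in range(X.shape[1]):
        known = np.flatnonzero(~np.isnan(X[:, j]))
        if known.size < 2:
            choice[j] = "mean"  # too little information to rank methods
            continue
        # Always leave at least one known value in the column.
        n_hide = min(known.size - 1, max(1, int(hide_frac * known.size)))
        hidden = rng.choice(known, size=n_hide, replace=False)
        X_masked = X.copy()
        X_masked[hidden, j] = np.nan
        errors = {}
        for name, method in METHODS.items():
            estimate = method(X_masked, j, prev)
            errors[name] = np.mean(np.abs(estimate[hidden] - X[hidden, j]))
        choice[j] = min(errors, key=errors.get)
    return choice

def impute(X, prev=None, **kwargs):
    """Impute each feature with the method selected for that feature."""
    choice = select_method_per_feature(X, prev, **kwargs)
    out = X.copy()
    for j, name in choice.items():
        out[:, j] = METHODS[name](X, j, prev)
    return out, choice
```

A call such as `impute(X, prev={1: 0})` treats column 1 as a later-wave measurement of column 0 and returns both the completed array and the method chosen for each feature. Ranking methods on artificially hidden known values is one straightforward way to use "known information in the dataset"; the actual ranking protocol in the paper may differ.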

Similar resources

Missing data imputation in multivariable time series data

Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Much research has been done on the use of diffe...

A Structured Prediction Approach for Missing Value Imputation

Missing value imputation is an important practical problem. There is a large body of work on it, but there does not exist any work that formulates the problem in a structured output setting. Also, most applications have constraints on the imputed data, for example on the distribution associated with each variable. None of the existing imputation methods use these constraints. In this paper we p...

Missing Value Imputation Based on Data Clustering

We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values that are generated from the data in the instances which do not contain missing values and are most similar to t...

BIOINFORMATICS Collateral Missing Value Imputation: A New Robust Missing Value Estimation Algorithm For Microarray Data

Motivation: Microarray data is used in a range of application areas in biology, though often it contains considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible prior to using these algorithms. While many imputation algo...

Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data

Motivation: Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algo...

Journal

Journal title: Artificial Intelligence Review

Year: 2021

ISSN: 0269-2821, 1573-7462

DOI: https://doi.org/10.1007/s10462-021-09963-5