Comparative Study of Four Methods in Missing Value Imputations under Missing Completely at Random Mechanism
Authors
Abstract
In analyzing data from clinical trials and longitudinal studies, missing values are a fundamental challenge because they can introduce bias and lead to erroneous statistical inferences. Several methods have been developed in the literature to handle missing values; the most commonly used are the complete case method, mean imputation, last observation carried forward (LOCF), and multiple imputation (MI). In this paper, we conduct a simulation study to investigate the efficiency of these four methods in a longitudinal data setting under the missing completely at random (MCAR) mechanism. We consider three levels of missingness: 5%, 30%, and 50%. The simulation shows that LOCF has more bias than the other three methods in most situations, while MI has the least bias and the best coverage probability. We therefore conclude that MI is the most effective imputation method in our MCAR simulation study.
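To make the comparison concrete, the following is a minimal sketch of an MCAR simulation of this kind, not the authors' actual design. Everything in it is an assumption chosen for illustration: the two-visit data model with an upward trend, the sample sizes, the function names (`simulate`, `impute_estimates`, `run_study`), and the MI step, which here is a bootstrap-based stochastic regression imputation on the baseline visit with the point estimates averaged across imputations. It reports only the bias of the estimated visit-2 mean and does not compute coverage probability.

```python
import numpy as np

def simulate(n, rng):
    """Hypothetical two-visit data: visit 2 = visit 1 + 1 + noise (true mean 11)."""
    y1 = rng.normal(10.0, 2.0, size=n)
    y2 = y1 + 1.0 + rng.normal(0.0, 1.0, size=n)
    return y1, y2

def impute_estimates(y1, y2_obs, rng, m=20):
    """Estimate the visit-2 mean four ways; y2_obs holds NaN where missing."""
    miss = np.isnan(y2_obs)
    obs = ~miss
    est = {}
    # Complete case: use only subjects with an observed visit-2 value.
    est["complete_case"] = y2_obs[obs].mean()
    # Mean imputation: fill each gap with the observed visit-2 mean.
    est["mean_imputation"] = np.where(miss, y2_obs[obs].mean(), y2_obs).mean()
    # LOCF: carry the visit-1 value forward into the missing visit-2 slot.
    est["locf"] = np.where(miss, y1, y2_obs).mean()
    # MI (assumed variant): refit a regression of y2 on y1 on a bootstrap
    # sample, impute with residual noise, and average the m estimates.
    idx_obs = np.flatnonzero(obs)
    draws = []
    for _ in range(m):
        b = rng.choice(idx_obs, size=idx_obs.size, replace=True)
        slope, intercept = np.polyfit(y1[b], y2_obs[b], 1)
        resid_sd = np.std(y2_obs[b] - (intercept + slope * y1[b]), ddof=2)
        y2_imp = y2_obs.copy()
        y2_imp[miss] = (intercept + slope * y1[miss]
                        + rng.normal(0.0, resid_sd, size=int(miss.sum())))
        draws.append(y2_imp.mean())
    est["multiple_imputation"] = float(np.mean(draws))
    return est

def run_study(rate, n=200, reps=300, seed=0):
    """Average bias of each method at a given MCAR missingness rate."""
    rng = np.random.default_rng(seed)
    true_mean = 11.0  # follows from the data model in simulate()
    totals = {}
    for _ in range(reps):
        y1, y2 = simulate(n, rng)
        y2_obs = y2.copy()
        # MCAR: each visit-2 value is dropped independently of all data.
        y2_obs[rng.random(n) < rate] = np.nan
        for name, value in impute_estimates(y1, y2_obs, rng).items():
            totals[name] = totals.get(name, 0.0) + (value - true_mean)
    return {name: total / reps for name, total in totals.items()}

if __name__ == "__main__":
    for rate in (0.05, 0.30, 0.50):
        print(rate, run_study(rate))
```

In this toy model LOCF is biased because the outcome trends upward between visits, so carrying the earlier value forward understates the visit-2 mean, and the bias grows with the missingness rate. Mean imputation gives the same point estimate as the complete case method here; the two would differ in their variance estimates, which this sketch does not examine.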
Similar resources
A Comparative Review of Selection Models in Longitudinal Continuous Response Data with Dropout
Missing values occur in studies of various disciplines such as social sciences, medicine, and economics. The missing mechanism in these studies should be investigated more carefully. In this article, some models proposed in the literature on longitudinal data with dropout are reviewed and compared. In an applied example it is shown that the selection model of Hausman and Wise (1979, Econometri...
Review: a gentle introduction to imputation of missing values.
In most situations, simple techniques for handling missing data (such as complete case analysis, overall mean imputation, and the missing-indicator method) produce biased results, whereas imputation techniques yield valid results without complicating the analysis once the imputations are carried out. Imputation techniques are based on the idea that any subject in a study sample can be replaced ...
Multiple Imputation Models in the 2002 Environmental Sustainability Index
Missing data arise in many situations and pose more than a technical problem to the analyst. Software applications often require complete datasets, but more importantly, incomplete or missing observations may impact on the validity of the statistical analysis. Ad-hoc solutions, such as case or listwise deletion, mean imputation, and simple regression methods can lead to severe bias if the popul...
Marginal Analysis of A Population-Based Genetic Association Study of Quantitative Traits with Incomplete Longitudinal Data
A common study to investigate gene-environment interaction is designed to be longitudinal and population-based. Data arising from longitudinal association studies often contain missing responses. Naive analysis without taking missingness into account may produce invalid inference, especially when the missing data mechanism depends on the response process. To address this issue in the ana...
Multiple Imputation for Missing Data: A Cautionary Tale
Two algorithms for producing multiple imputations for missing data are evaluated with simulated data. Software using a propensity score classifier with the approximate Bayesian bootstrap produces badly biased estimates of regression coefficients when data on predictor variables are missing at random or missing completely at random. On the other hand, a regression-based method employing the data ...