Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models

نویسندگان

  • Min Cherng Lee
  • Robin Mitra
چکیده

Abstract Multiple imputation is a commonly used approach to deal with missing values. In this approach, an imputer repeatedly imputes the missing values by taking draws from the posterior predictive distribution for the missing values conditional on the observed values, and releases these completed data sets to analysts. With each completed data set the analyst performs the analysis of interest, treating the data as if it were fully observed. These analyses are then combined with standard combining rules, allowing the analyst to make appropriate inferences which take into account the uncertainty present due to the missing data. In order to preserve the statistical properties present in the data, the imputer must use a plausible distribution to generate the imputed values. In data sets containing variables with different measurement scales, e.g. some categorical and some continuous variables, Multivariate Imputation by Chained Equations (MICE) is a commonly used multiple imputation method. However, imputations from such an approach are not necessarily drawn from

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data, and these internal data sets are usually completed using external credit bureau’s data and finally used for estimating PD in banks. There is also a continuous interest for bank to use rule based classifiers to b...

متن کامل

Data-driven methods for imputing national-level incidence in global burden of disease studies

OBJECTIVE To develop transparent and reproducible methods for imputing missing data on disease incidence at national-level for the year 2005. METHODS We compared several models for imputing missing country-level incidence rates for two foodborne diseases - congenital toxoplasmosis and aflatoxin-related hepatocellular carcinoma. Missing values were assumed to be missing at random. Predictor va...

متن کامل

Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and nonlinear terms

The paper extends existing models for multilevel multivariate data with mixed response types to handle quite general types and patterns of missing data values in a wide range of multilevel generalized linear models. It proposes an efficient Bayesian modelling approach that allows missing values in covariates, including models where there are interactions or other functions of covariates such as...

متن کامل

An Estimation of Missing Values by Modified Mixed Kernels

----In statistical practices, difficulties of missing data are universal. Several techniques are used to handle this dilemma of missing data. They include both old approaches, which require only a small amount of mathematical computations and new approaches, which require additional difficult computations that are ever easier for social work researchers to carry out the statistical programming ...

متن کامل

Multiple imputation and other resampling schemes for imputing missing observations

The problem of imputing missing observations under the linear regression model is considered. It is assumed that observations are missing at random and all the observations on the auxiliary or independent variables are available. Estimates of the regression parameters based on singly and multiply imputed values are given. Jackknife as well as bootstrap estimates of the variance of the singly im...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Statistics & Data Analysis

دوره 95  شماره 

صفحات  -

تاریخ انتشار 2016