Department of Quantitative Social Science
Missing ordinal covariates with informative selection
Authors
Abstract
This paper considers the problem of parameter estimation in a model for a continuous response variable y when an important ordinal explanatory variable x is missing for a large proportion of the sample. Nonmissingness of x, or sample selection, is correlated with the response variable and/or with the unobserved values that the ordinal explanatory variable takes when it is missing. We suggest solving this endogenous selection, or 'not missing at random' (NMAR), problem by modelling the informative selection mechanism, the ordinal explanatory variable, and the response variable jointly. The method is illustrated by re-examining the ethnic gap in school achievement at age 16 in England, using linked data from the National Pupil Database (NPD), the Longitudinal Study of Young People in England (LSYPE), and the 2001 Census.
JEL classification: C13, C35, I21.
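To make the NMAR problem in the abstract concrete, the following is a minimal simulation sketch, not the paper's estimator. It generates a three-category ordinal x, a continuous y, and a selection indicator that depends on y, then compares the regression slope from the full sample with the slope from the complete cases only. Every name and number here (simulate, ols_slope, the cut points, the slope of 1.0, the selection coefficients) is an assumption chosen for illustration, and x is treated as numeric for simplicity.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate (x, y, observed) under informative selection (illustrative values)."""
    rng = np.random.default_rng(seed)
    # Latent normal variable cut into three ordered categories: 0, 1, 2.
    x_star = rng.normal(size=n)
    x = np.digitize(x_star, bins=[-0.5, 0.5])
    # Continuous response; the true slope on x is 1.0 (assumed).
    y = 1.0 * x + rng.normal(size=n)
    # Selection depends on y, so x is 'not missing at random'.
    s_star = -0.5 + 0.8 * y + rng.normal(size=n)
    observed = s_star > 0
    return x, y, observed


def ols_slope(x, y):
    """Slope from an ordinary least squares fit of y on an intercept and x."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


if __name__ == "__main__":
    x, y, observed = simulate()
    print("share with x observed:", observed.mean())
    print("slope, full sample (infeasible in practice):", ols_slope(x, y))
    print("slope, complete cases only:", ols_slope(x[observed], y[observed]))
```

In this setup the complete-case slope falls noticeably below the full-sample slope, because cases with high y are over-represented among those with x observed. This is the kind of bias that motivates modelling selection, x, and y jointly rather than dropping incomplete cases.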
Similar resources
Bayesian Structural Equations Modeling for Ordinal Response Data with Missing Responses and Missing Covariates
SUNGDUK KIM, SONALI DAS, MING-HUI CHEN, AND NICHOLAS WARREN Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, Rockville, Maryland, USA Logistics and Quantitative Methods, CSIR BE, PO Box 395, Pretoria, South Africa Department of Statistics, University of Connecticut, Storrs, Connecticut, USA Univer...
Bayesian Methods to Impute Missing Covariates for Causal Inference and Model Selection
BAYESIAN METHODS TO IMPUTE MISSING COVARIATES FOR CAUSAL INFERENCE AND MODEL SELECTION by Robin Mitra Department of Statistical Science Duke University
Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data
Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...
Missing covariates in longitudinal data with informative dropouts: bias analysis and inference.
We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covari...
Simple adjustments for randomized trials with nonrandomly missing or censored outcomes arising from informative covariates.
In randomized trials with missing or censored outcomes, standard maximum likelihood estimates of the effect of intervention on outcome are based on the assumption that the missing-data mechanism is ignorable. This assumption is violated if there is an unobserved baseline covariate that is informative, namely a baseline covariate associated with both outcome and the probability that the outcome ...