SEM with Missing Data and Unknown Population Distributions Using Two-Stage ML: Theory and Its Application.

Authors

  • Ke-Hai Yuan
  • Laura Lu
Abstract

This article provides the theory and application of the 2-stage maximum likelihood (ML) procedure for structural equation modeling (SEM) with missing data. The validity of this procedure does not require the assumption of a normally distributed population. When the population is normally distributed and all missing data are missing at random (MAR), the direct ML procedure is nearly optimal for SEM with missing data. When missing data mechanisms are unknown, including auxiliary variables in the analysis will make the missing data mechanism more likely to be MAR. It is much easier to include auxiliary variables in the 2-stage ML than in the direct ML. Based on the most recent developments for missing data with an unknown population distribution, the article first provides the least technical material on why the normal distribution-based ML generates consistent parameter estimates when the missing data mechanism is MAR. The article also provides sufficient conditions for the 2-stage ML to be a valid statistical procedure in the general case. For the application of the 2-stage ML, a SAS IML program is given to perform the first-stage analysis and EQS code is provided to perform the second-stage analysis. An example with open- and closed-book examination data is used to illustrate the application of the provided programs. One aim is for quantitative graduate students and applied psychometricians to understand the technical details of missing data analysis. Another aim is for applied researchers to use the method properly.
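The SAS IML and EQS programs mentioned in the abstract are not reproduced on this page. As a rough illustration only, the sketch below assumes that the first stage amounts to computing normal-distribution-based ML estimates of the saturated mean vector and covariance matrix from incomplete data, here via a plain EM iteration. The function name `em_mvn`, its parameters, and the NaN coding of missing values are hypothetical choices made for this example and are not the authors' program.

```python
import numpy as np

def em_mvn(X, n_iter=200, tol=1e-8):
    """Hypothetical first-stage sketch: EM estimates of the saturated mean
    and covariance under a normal likelihood; NaN marks a missing value."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Start from available-case means and a diagonal covariance.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            c = np.zeros((p, p))  # conditional covariance of the missing part
            if m.any():
                if o.any():
                    s_oo = sigma[np.ix_(o, o)]
                    s_mo = sigma[np.ix_(m, o)]
                    # b = s_mo @ inv(s_oo), computed without an explicit inverse
                    b = np.linalg.solve(s_oo, s_mo.T).T
                    # E-step: conditional mean and covariance given observed values
                    x[m] = mu[m] + b @ (x[o] - mu[o])
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - b @ s_mo.T
                else:
                    # Row with nothing observed: fall back to current estimates
                    x[m] = mu[m]
                    c = sigma.copy()
            sum_x += x
            sum_xx += np.outer(x, x) + c
        # M-step: update from the expected sufficient statistics (divisor n)
        mu_new = sum_x / n
        sigma_new = sum_xx / n - np.outer(mu_new, mu_new)
        delta = max(np.abs(mu_new - mu).max(), np.abs(sigma_new - sigma).max())
        mu, sigma = mu_new, sigma_new
        if delta < tol:
            break
    return mu, sigma
```

In a two-stage workflow, the returned `mu` and `sigma` would then be passed to SEM software in place of the complete-data sample moments; the abstract indicates that EQS is used for that second stage. With complete data, this sketch reduces to the sample mean and the ML (divisor n) covariance matrix.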

Similar resources

Normal Theory GLS Estimator for Missing Data: An Application to Item-Level Missing Data and a Comparison to Two-Stage ML

Structural equation models (SEMs) can be estimated using a variety of methods. For complete normally distributed data, two asymptotically efficient estimation methods exist: maximum likelihood (ML) and generalized least squares (GLS). With incomplete normally distributed data, an extension of ML called "full information" ML (FIML), is often the estimation method of choice. An extension of GLS t...

Two-stage estimation using copula function

Maximum likelihood estimation of multivariate distributions requires solving an optimization problem of large dimension (equal to the number of unknown parameters), but two-stage estimation divides this problem into several simple optimizations. This saves a significant amount of computational time. Two methods are investigated for checking estimation consistency. We revisit Sankaran and Nair's bivari...

Scale Efficient Targets in Production Systems With Two-stage Structure Under Imprecise Data Assumption

Traditional data envelopment analysis (DEA) models evaluate a two-stage decision-making unit (DMU) as a black box and neglect the connectivity that may exist among the stages. This paper looks inside the system by considering the intermediate activities between the stages, where the first stage uses inputs to produce outputs that are the inputs to the second stage along with its own inputs. Additional...

A Method to Expand Family of Continuous Distributions based on Truncated Distributions

A new method to generate various families of distributions is introduced. To illustrate its application, the method is used to introduce a new two-parameter extension of the exponential distribution. Some statistical and reliability properties of the new distribution, including explicit expressions for the moments, quantiles, mode, moment generating function, mean residual lifetime, stochas...

Journal:
  • Multivariate Behavioral Research

Volume 43, Issue 4

Pages: -

Publication date: 2008