Generalized Latent Class Analysis Based on Model Dominance Theory
نویسندگان
چکیده
Latent class analysis is a popular statistical learning approach. A major challenge for learning generalized latent class is the complexity in searching the huge space of models and parameters. The computational cost is higher when the model topology is more flexible. In this paper, we propose the notion of dominance which can lead to strong pruning of the search space and significant reduction of learning complexity, and apply this notion to the Generalized Latent Class (GLC) models, a class of Bayesian networks for clustering categorical data. GLC models can address the local dependence problem in latent class analysis by assuming a very general graph structure. However, The flexible topology of GLC leads to large increase of the learning complexity. We first propose the concept of dominance and related theoretical results which is general for all Bayesian networks. Based on dominance, we propose an efficient learning algorithm for GLC. A core technique to prune dominated models is regularization, which can eliminate dominated models, leading to significant pruning of the search space. Significant improvements on the model
منابع مشابه
An application of Measurement error evaluation using latent class analysis
Latent class analysis (LCA) is a method of evaluating non sampling errors, especially measurement error in categorical data. Biemer (2011) introduced four latent class modeling approaches: probability model parameterization, log linear model, modified path model, and graphical model using path diagrams. These models are interchangeable. Latent class probability models express l...
متن کاملUsing multivariate generalized linear latent variable models to measure the difference in event count for stranded marine animals
BACKGROUND AND OBJECTIVES: The classification of marine animals as protected species makes data and information on them to be very important. Therefore, this led to the need to retrieve and understand the data on the event counts for stranded marine animals based on location emergence, number of individuals, behavior, and threats to their presence. Whales are g...
متن کاملبهکارگیری متغیرهای پنهان در مدل رگرسیون لجستیک برای حذف اثر همخطی چندگانه در تحلیل برخی عوامل مرتبط با سرطان پستان
Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analysis of the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) reduce the efficiency of model to a considerable degree. In this study we used latent variables to reduce the effects of ...
متن کاملStochastic dominance-based rough set model for ordinal classification
In order to discover interesting patterns and dependencies in data, an approach based on rough set theory can be used. In particular, Dominance-based Rough Set Approach (DRSA) has been introduced to deal with the problem of ordinal classification with monotonicity constraints (also referred to as multicriteria classification in decision analysis). However, in real-life problems, in the presence...
متن کاملPresentation of Economic Regeneration Model in Historic Fabric Based on Order in Structural Functionalism Theory
Historic fabric can perform an important role in the development of cities. Urban sustainable regeneration is one of the recent approaches in historic fabric. In this approach, all indicator of sustainable development including economic, social, cultural, management and environmental dimensions have been used in conservation of the historic fabric. All the principles of sustainable development ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- International Journal on Artificial Intelligence Tools
دوره 18 شماره
صفحات -
تاریخ انتشار 2009