Variable Selection for Clustering with Gaussian Mixture Models
Similar resources
Variable selection for clustering with Gaussian mixture models.
This article is concerned with variable selection for cluster analysis. The problem is regarded as a model selection problem in the model-based cluster analysis context. A model generalizing the model of Raftery and Dean (2006, Journal of the American Statistical Association 101, 168-178) is proposed to specify the role of each variable. This model does not need any prior assumptions about the ...
Variable selection for clustering with Gaussian mixture models: state of the art
Mixture models have become widely used in clustering because of the probabilistic framework on which they are based. However, for modern databases characterized by their large size, these models perform disappointingly when specifying the model, which makes selecting the relevant variables essential for this type of clustering. After recalling the basics of model-based clustering, this art...
Hard-Clustering with Gaussian Mixture Models
Training the parameters of statistical models to describe a given data set is a central task in the field of data mining and machine learning. A very popular and powerful method of parameter estimation is maximum likelihood estimation (MLE). Among the most widely used families of statistical models are mixture models, especially mixtures of Gaussian distributions. A popular hard-clu...
Variable selection in clustering via Dirichlet process mixture models
The increased collection of high-dimensional data in various fields has raised a strong interest in clustering algorithms and variable selection procedures. In this paper, we propose a model-based method that addresses the two problems simultaneously. We introduce a latent binary vector to identify discriminating variables and use Dirichlet process mixture models to define the cluster structure...
Gaussian Mixture Modeling with Gaussian Process Latent Variable Models
Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low di...
Journal
Journal title: Biometrics
Year: 2009
ISSN: 0006-341X
DOI: 10.1111/j.1541-0420.2008.01160.x