Dimension reduction for model-based clustering via mixtures of multivariate t-distributions
Authors
Abstract
Katherine Morris, University of Guelph, 2012. Advisor: Prof. Paul D. McNicholas.
We introduce a dimension reduction method for model-based clustering obtained from a finite mixture of t-distributions. The approach builds on existing work on reducing dimensionality for finite Gaussian mixtures. The method identifies a reduced subspace of the data by considering how much the group means and group covariances vary. This subspace is spanned by linear combinations of the original variables, ordered by importance via the associated eigenvalues. Observations can be projected onto the subspace, and the resulting set of variables captures most of the clustering structure available in the data. The approach is illustrated using simulated and real data.
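The subspace construction described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the kernel matrix follows the form commonly used for Gaussian mixture dimension reduction (a term for variation in group means plus a term for variation in group covariances, solved as a generalized eigenproblem against the marginal covariance), and hard group labels stand in for the classification that a fitted t-mixture would provide. The function name `reduction_subspace` and its arguments are hypothetical.

```python
import numpy as np


def reduction_subspace(X, labels):
    """Sketch: eigenvalues (descending) and basis vectors of a reduced subspace.

    Assumption: `labels` are hard group memberships, used here in place of
    the classification obtained from a fitted mixture of t-distributions.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, p = X.shape

    mu = X.mean(axis=0)                         # overall mean
    sigma = np.cov(X, rowvar=False, bias=True)  # marginal covariance

    props, means, covs = [], [], []
    for g in np.unique(labels):
        Xg = X[labels == g]
        props.append(len(Xg) / n)               # mixing proportion
        means.append(Xg.mean(axis=0))           # group mean
        covs.append(np.cov(Xg, rowvar=False, bias=True))  # group covariance
    pooled = sum(w * S for w, S in zip(props, covs))

    # Variation in group means (between-group covariance).
    between = sum(w * np.outer(m - mu, m - mu) for w, m in zip(props, means))

    # Kernel: mean-variation term plus covariance-variation term.
    sigma_inv = np.linalg.inv(sigma)
    kernel = between @ sigma_inv @ between
    for w, S in zip(props, covs):
        diff = S - pooled
        kernel += w * diff @ sigma_inv @ diff

    # Generalized eigenproblem kernel v = lambda * sigma v,
    # solved by whitening with the Cholesky factor of sigma.
    chol = np.linalg.cholesky(sigma)
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ kernel @ chol_inv.T
    whitened = (whitened + whitened.T) / 2      # enforce exact symmetry
    evals, vecs = np.linalg.eigh(whitened)

    order = np.argsort(evals)[::-1]             # most important direction first
    return evals[order], chol_inv.T @ vecs[:, order]
```

Given `evals, V = reduction_subspace(X, labels)`, the projected variables for the first `d` directions would be `X @ V[:, :d]`; how many directions to retain is left open here.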
Similar resources
Robust Cluster Analysis via Mixture Models
Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster data sets. In this paper, we focus on the use of normal mixture models to cluster data sets of continuous multivariate data. As normality based methods of estimation are not robust, we review the use of t component distributions. With the t mixture model-based approa...
Mixtures of common t-factor analyzers for clustering high-dimensional microarray data
Motivation: Mixtures of factor analyzers enable model-based clustering to be undertaken for high-dimensional microarray data, where the number of observations n is small relative to the number of genes p. Moreover, when the number of clusters is not small, for example, where there are several different types of cancer, there may be the need to reduce further the number of parameters in the speci...
Robust Fuzzy Classification Maximum Likelihood Clustering with Multivariate t-Distributions
Mixtures of distributions have been used as probability models for clustering data. The classification maximum likelihood (CML) procedure is a popular mixture maximum likelihood approach to clustering. Yang (1993) extended CML to fuzzy CML (FCML) for a normal mixture model, called FCML-N. However, normal distributions are not robust to outliers. In general, t-distributions should be more robust...
Rejoinder to the discussion of "Model-based clustering and classification with non-normal mixture distributions"
Non-normal mixture distributions have received increasing attention in recent years. Finite mixtures of multivariate skew-symmetric distributions, in particular, the skew normal and skew t-mixture models, are emerging as promising extensions to the traditional normal and t-mixture models. Most of these parametric families of skew distributions are closely related, and can be classified into fou...
Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering
The family of location and scale mixtures of Gaussians has the ability to generate a number of flexible distributional forms. The family nests as particular cases several important asymmetric distributions like the Generalised Hyperbolic distribution. The Generalised Hyperbolic distribution in turn nests many other well known distributions such as the Normal Inverse Gaussian. In a multivariate ...
Journal title: Adv. Data Analysis and Classification
Volume: 7
Publication date: 2013