Unsupervised feature selection using weighted principal components
Abstract
Feature selection has received considerable attention in various areas as a way to select informative features and to simplify the statistical model through dimensionality reduction. One of the most widely used methods for dimensionality reduction is principal component analysis (PCA). Despite its popularity, PCA suffers from a lack of interpretability with respect to the original features because the reduced dimensions are linear combinations of a large number of original features. Traditionally, two- or three-dimensional loading plots provide information to identify important original features in the first few principal component dimensions. However, the interpretation of a loading plot is frequently subjective, particularly when large numbers of features are involved. In this study, we propose an unsupervised feature selection method that combines weighted principal components (PCs) with a thresholding algorithm. The weighted PC is obtained as the weighted sum of the first k PCs of interest. Each loading value in the weighted PC reflects the contribution of an individual feature. We also propose a thresholding algorithm that identifies the significant features. Our experimental results with both simulated and real datasets demonstrated the effectiveness of the proposed unsupervised feature selection method.
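One possible reading of the weighted-PC construction described in the abstract is sketched below in Python with NumPy. The abstract does not state how the k PCs are weighted or how the threshold is chosen, so this sketch assumes weights proportional to each PC's explained variance, absolute loading values, and a plain mean-score cutoff standing in for the thresholding algorithm. The function names `weighted_pc_scores` and `select_features` are hypothetical and do not come from the paper.

```python
import numpy as np


def weighted_pc_scores(X, k):
    """Return one non-negative importance score per feature.

    X is an (n_samples, n_features) array; k is the number of leading
    PCs to combine. ASSUMPTION: weights are the explained-variance
    shares of the first k PCs, and loadings enter by absolute value.
    """
    X = np.asarray(X, dtype=float)
    # Standardize so that PCA acts on the correlation matrix.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the PC loading vectors, ordered by variance.
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    eigvals = s ** 2 / (X.shape[0] - 1)
    weights = eigvals[:k] / eigvals[:k].sum()
    # Weighted sum of the absolute loadings of the first k PCs.
    return weights @ np.abs(Vt[:k])


def select_features(scores, threshold=None):
    """Return indices of features whose score exceeds the threshold.

    ASSUMPTION: with no threshold given, the mean score is used as a
    simple placeholder for the paper's thresholding algorithm.
    """
    scores = np.asarray(scores, dtype=float)
    if threshold is None:
        threshold = scores.mean()
    return np.flatnonzero(scores > threshold)
```

Calling `select_features(weighted_pc_scores(X, k))` returns the column indices of the retained features. Using absolute loadings keeps contributions of opposite sign from cancelling when PCs are summed, which is a design choice of this sketch rather than something the abstract specifies.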
Related resources
Feature selection using genetic algorithm for classification of schizophrenia using fMRI data
In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...
Globally Sparse Probabilistic PCA
With the flourishing development of high-dimensional data, sparse versions of principal component analysis (PCA) have imposed themselves as simple, yet powerful ways of selecting relevant features in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables may be difficult since each axis has its own sparsity pattern and...
Cluster-Dependent Feature Selection through a Weighted Learning Paradigm
This paper addresses the problem of selecting a subset of the most relevant features from a dataset through a weighted learning paradigm. We propose two automated feature selection algorithms for unlabeled data. In contrast to supervised learning, the problem of automated feature selection and feature weighting in the context of unsupervised learning is challenging, because label information is...
Automatic feature selection for unsupervised clustering of cycle-based signals in manufacturing processes
Recent developments in sensing and computer technology have resulted in most manufacturing processes becoming a data-rich environment. A cycle-based signal refers to an analog or digital signal that is obtained during each repetition of an operation cycle in a manufacturing process. It is a very important class of in-process sensing signals for manufacturing processes because it contains extens...
Unsupervised Parallel Feature Extraction from First Principles
We describe a number of learning rules that can be used to train unsupervised parallel feature extraction systems. The learning rules are derived using gradient ascent of a quality function. We consider a number of quality functions that are rational functions of higher order moments of the extracted feature values. We show that one system learns the principal components of the correlation matr...
Journal title:
- Expert Syst. Appl.
Volume 38
Publication date: 2011