Genetic algorithms for linear feature extraction

Authors

  • Alberto J. Pérez Jiménez
  • Juan Carlos Pérez-Cortes
Abstract

Feature extraction is a commonly used technique applied before classification when a number of measures, or features, have been taken from a set of objects in a typical statistical pattern recognition task. The goal is to define a mapping from the original representation space into a new space where the classes are more easily separable. This reduces classifier complexity and, in most cases, increases classifier accuracy. Feature extraction methods can be divided into linear and non-linear, depending on the nature of the mapping function (Lerner et al., 1998). They can also be classified as supervised or unsupervised, depending on whether the class information is taken into account. Feature extraction can also be used for exploratory data analysis, where the aim is not to improve classification accuracy but to visualise high-dimensional data by mapping it onto the plane or into three-dimensional space. The best-known linear methods are Principal Component Analysis, or PCA (unsupervised) (Fukunaga, 1990), Linear Discriminant Analysis, or LDA (supervised) (Fukunaga, 1990; Aladjem, 1991; Siedlecki et al., 1988), and Independent Component Analysis, or ICA (unsupervised) (Cardoso, 1993). Schematically, PCA preserves as much of the variance of the data as possible, LDA attempts to group patterns of the same class while separating them from the other classes, and ICA obtains a new set of features by extracting the less correlated (in a broad sense) directions in the data set. Well-known non-linear methods, on the other hand, include Sammon’s mapping (unsupervised) (Sammon, 1969; Siedlecki et al., 1988), non-linear discriminant analysis, or NDA (supervised) (Mao & Jain, 1995), Kohonen’s self-organising map (unsupervised) (Kohonen, 1990) and evolutionary extraction (supervised) (Liu & Motoda, 1998).
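To make the contrast between PCA and LDA concrete, the following is a minimal sketch in plain Python on a two-dimensional, two-class toy data set. The data, the function names and the closed-form 2x2 algebra are illustrative assumptions and do not come from the paper. The toy data varies mostly along x, while the classes differ only along y, so the two methods are expected to pick nearly orthogonal directions.

```python
import math

# Hypothetical toy data: large spread along x, classes separated along y.
CLASS0 = [(-4, 0.0), (-2, 0.1), (0, -0.1), (2, 0.1), (4, -0.1)]
CLASS1 = [(-4, 1.0), (-2, 1.1), (0, 0.9), (2, 1.1), (4, 0.9)]


def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    sxx = sxy = syy = 0.0
    for x, y in points:
        dx, dy = x - centre[0], y - centre[1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    return [[sxx, sxy], [sxy, syy]]


def unit(v):
    norm = math.hypot(v[0], v[1])
    return [v[0] / norm, v[1] / norm]


def pca_direction(points):
    """Unsupervised: leading eigenvector of the total scatter matrix."""
    (a, b), (_, d) = scatter(points, mean(points))
    # Largest eigenvalue of the symmetric matrix [[a, b], [b, d]].
    lam = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
    if abs(b) > 1e-12:
        v = [b, lam - a]  # solves (S - lam * I) v = 0
    else:
        v = [1.0, 0.0] if a >= d else [0.0, 1.0]
    return unit(v)


def lda_direction(class0, class1):
    """Supervised: Fisher direction w = Sw^{-1} (m1 - m0) for two classes."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    a = s0[0][0] + s1[0][0]
    b = s0[0][1] + s1[0][1]
    d = s0[1][1] + s1[1][1]
    det = a * d - b * b
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # Inverse of [[a, b], [b, d]] is (1 / det) * [[d, -b], [-b, a]].
    w = [(d * dm[0] - b * dm[1]) / det, (-b * dm[0] + a * dm[1]) / det]
    return unit(w)


pca = pca_direction(CLASS0 + CLASS1)
lda = lda_direction(CLASS0, CLASS1)
print("PCA direction:", pca)
print("LDA direction:", lda)
```

On this data PCA follows the high-variance x axis and ignores the labels, whereas LDA follows the y axis along which the classes separate.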
Sammon’s mapping tries to preserve the distances among the observations using hill-climbing or neural network methods (Mao & Jain, 1995; Sammon, 1969), NDA obtains new features from the coefficients of the second hidden layers of a multi-layer perceptron (MLP) (Mao & Jain, 1995), and Kohonen maps project data in an attempt to preserve the topology. Finally, evolutionary extraction uses a genetic algorithm to find combinations of original features in order to improve classifier accuracy. These new features are obtained by multiplying, dividing, adding or subtracting the original features. In the linear methods, the mapping function is known and simple; therefore, the task is reduced to finding the coefficients of the linear transformation by maximising or minimising
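As an illustration of searching for the coefficients of a linear transformation with a genetic algorithm, here is a toy sketch in plain Python. It is not the method of the paper: the fitness criterion (a Fisher-style ratio of between-class to within-class scatter of the projected data), the operators (tournament selection, blend crossover, Gaussian mutation, elitism), the toy data and all names and parameter values are assumptions chosen for brevity.

```python
import random

# Hypothetical toy data: large spread along x, classes separated along y.
CLASS0 = [(-4, 0.0), (-2, 0.1), (0, -0.1), (2, 0.1), (4, -0.1)]
CLASS1 = [(-4, 1.0), (-2, 1.1), (0, 0.9), (2, 1.1), (4, 0.9)]


def fisher_ratio(w, class0, class1):
    """Between-class over within-class scatter of the projection z = w . x."""
    z0 = [w[0] * x + w[1] * y for x, y in class0]
    z1 = [w[0] * x + w[1] * y for x, y in class1]
    m0, m1 = sum(z0) / len(z0), sum(z1) / len(z1)
    within = sum((z - m0) ** 2 for z in z0) + sum((z - m1) ** 2 for z in z1)
    return (m1 - m0) ** 2 / (within + 1e-12)  # small constant avoids 0 / 0


def evolve(class0, class1, pop_size=30, generations=60, seed=0):
    """Evolve a 2-D -> 1-D projection vector; returns (best, fitness history)."""
    rng = random.Random(seed)

    def fitness(w):
        return fisher_ratio(w, class0, class1)

    population = [[rng.uniform(-1, 1), rng.uniform(-1, 1)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        children = [population[0][:]]  # elitism: keep the current best
        while len(children) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Blend crossover followed by Gaussian mutation of each gene.
            alpha = rng.random()
            child = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
            child = [g + rng.gauss(0.0, 0.1) for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(CLASS0, CLASS1)
print("best projection vector:", best)
print("fitness, first vs last generation:", history[0], history[-1])
```

Because the best individual is always copied into the next generation, the recorded fitness never decreases. A classifier-based fitness, such as cross-validated accuracy on the projected data, could be substituted for `fisher_ratio` without changing the rest of the loop.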


Similar resources

Feature selection using genetic algorithm for classification of schizophrenia using fMRI data

In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...


Hyperspectral Image Classification Based on the Fusion of the Features Generated by Sparse Representation Methods, Linear and Non-linear Transformations

The ability of recording the high resolution spectral signature of earth surface would be the most important feature of hyperspectral sensors. On the other hand, classification of hyperspectral imagery is known as one of the methods to extracting information from these remote sensing data sources. Despite the high potential of hyperspectral images in the information content point of view, there...


Comparison of Parametric and Non-parametric EEG Feature Extraction Methods in Detection of Pediatric Migraine without Aura

Background: Migraine headache without aura is the most common type of migraine especially among pediatric patients. It has always been a great challenge of migraine diagnosis using quantitative electroencephalography measurements through feature classification. It has been proven that different feature extraction and classification methods vary in terms of performance regarding detection and di...


Sequential and Mixed Genetic Algorithm and Learning Automata (SGALA, MGALA) for Feature Selection in QSAR

Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as: GA, PSO, ACO, SA and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR f...



A Direct Evolutionary Feature Extraction Algorithm for Classifying High Dimensional Data

Among various feature extraction algorithms, those based on genetic algorithms are promising owing to their potential parallelizability and possible applications in large scale and high dimensional data classification. However, existing genetic algorithm based feature extraction algorithms are either limited in searching optimal projection basis vectors or costly in both time and space complexi...



Journal title:
  • Pattern Recognition Letters

Volume 27  Issue 

Pages  -

Publication date 2006