Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gene Expression

Authors

Abstract:

Growth in data volume and in the number of attributes per dataset has reduced the accuracy of learning algorithms and increased their computational complexity. Feature selection is one approach to dimensionality reduction, and it is carried out with filter methods or wrapper methods. Wrapper methods are more accurate than filter methods, whereas filter methods run faster and carry a lower computational burden. Given the advantages and disadvantages of filter and wrapper algorithms, this study proposes a new hybrid approach. The method starts from all features in the dataset, then selects the optimal feature subset by combining several filter feature selection algorithms and evaluating their results with a wrapper method. Because many diseases and biological conditions, such as cancer, can be identified and diagnosed through microarray data analysis, and because such datasets contain a large number of features, the proposed method is evaluated on microarray data for three types of cancer. Compared with similar methods, the results show that the proposed method achieves high accuracy in classification and in identifying the factors that affect cancer.
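The general filter-then-wrapper idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the filters, the search strategy, or the classifier, so the two filter scores, the greedy forward search, and the leave-one-out nearest-centroid classifier are all stand-ins, and every function name (`mean_gap_score`, `correlation_score`, `hybrid_select`, and so on) is hypothetical. The data are synthetic, with feature 0 made informative on purpose.

```python
import random
from statistics import mean, pvariance


def mean_gap_score(column, labels):
    """Assumed filter 1: gap between the two class means, scaled by spread."""
    a = [x for x, y in zip(column, labels) if y == 0]
    b = [x for x, y in zip(column, labels) if y == 1]
    spread = (pvariance(a) + pvariance(b)) ** 0.5 or 1e-12
    return abs(mean(a) - mean(b)) / spread


def correlation_score(column, labels):
    """Assumed filter 2: absolute Pearson correlation with the class label."""
    mx, my = mean(column), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = sum((x - mx) ** 2 for x in column) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return abs(cov) / (sx * sy) if sx and sy else 0.0


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def loo_accuracy(X, y, features):
    """Assumed wrapper criterion: leave-one-out nearest-centroid accuracy.

    Requires at least two samples per class so no centroid is empty.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            centroids[label] = [mean(row[j] for row in rows) for j in features]
        point = [X[i][j] for j in features]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def hybrid_select(X, y, k=5):
    """Pool the top-k features of each filter, then search them with the wrapper."""
    columns = list(zip(*X))
    candidates = set()
    for score in (mean_gap_score, correlation_score):
        candidates |= set(top_k([score(col, y) for col in columns], k))

    selected, best = [], 0.0
    improved = True
    while improved:  # greedy forward selection; stops when accuracy stalls
        improved = False
        for j in sorted(candidates - set(selected)):
            acc = loo_accuracy(X, y, selected + [j])
            if acc > best:
                best, pick, improved = acc, j, True
        if improved:
            selected.append(pick)
    return selected, best


# Synthetic stand-in for expression data: 40 samples, 30 features,
# where only feature 0 differs between the two classes.
random.seed(0)
y = [i % 2 for i in range(40)]
X = [[random.gauss(4.0 * label if j == 0 else 0.0, 1.0) for j in range(30)]
     for label in y]

selected, acc = hybrid_select(X, y)
print("selected features:", selected, "LOO accuracy:", acc)
```

The filters keep the wrapper affordable: instead of searching all 30 features, the classifier is only re-trained on the small pooled candidate set, which is the trade-off between speed and accuracy that motivates a hybrid design.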

Similar resources

A hybrid wrapper / filter approach for feature subset selection

This work presents a hybrid wrapper/filter algorithm for feature subset selection that can use a combination of several quality criteria measures to rank the set of features of a dataset. These ranked features are used to prune the search space of subsets of possible features such that the number of times the wrapper executes the learning algorithm for a dataset with M features is reduced to O(...

Wrapper-Filter Feature Selection Algorithm Using a Memetic Framework

This correspondence presents a novel hybrid wrapper and filter feature selection algorithm for a classification problem using a memetic framework. It incorporates a filter ranking method in the traditional genetic algorithm to improve classification performance and accelerate the search in identifying the core feature subsets. Particularly, the method adds or deletes a feature from a candidate ...

A Two-phase Feature Selection Method using both Filter and Wrapper

Feature selection is an integral step of the data mining process for finding an optimal subset of features. After examining the problems with both the filter and wrapper approaches to feature selection, we propose a two-phase feature selection algorithm, combining filter and wrapper, that can take advantage of both approaches. It begins by running GFSIC (filter approach) to remove irrelevant features, then it runs S...

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

A Hybrid Both Filter and Wrapper Feature Selection Method for Microarray Classification

Gene expression data is widely used in disease analysis and cancer diagnosis. However, since gene expression data could contain thousands of genes simultaneously, successful microarray classification is rather difficult. Feature selection is an important pre-treatment for any classification process. Selecting a useful gene subset as a classifier not only decreases the computational time and cost, bu...

IG-GA: A Hybrid Filter/Wrapper Method for Feature Selection of Microarray Data

Gene expression profiles have great potential as a medical diagnostic tool since they represent the state of a cell at the molecular level. Available training data sets for classification of cancer types generally have a fairly small sample size compared to the number of genes involved. This fact poses an insurmountable problem to some classification methodologies due to training data limitatio...

Journal title

volume 6 issue 2

pages 48-59

publication date 2017-09
