Supervised Feature Subset Selection using Extended Fuzzy Absolute Information Measure for Different Classifiers

Author

  • K. Sarojini
Abstract

Feature subset selection plays an important role in data mining and machine learning applications. Its main aim is to reduce dimensionality by removing irrelevant and redundant features while improving classification accuracy. This paper presents a supervised feature selection method called Extended Fuzzy Absolute Information Measure (EFAIM) for different classifiers. In this process, a discretization algorithm is first applied to the numeric and nominal features of a database to construct the fuzzy sets of each feature. EFAIM is then applied to select a feature subset, focusing on boundary samples. To verify the effectiveness of the method, several experiments are conducted with different classifiers (LMT, Naïve Bayes, SMO, C4.5, JRIP, PART, and Simple Cart) on different UCI datasets. The experimental results indicate that the proposed algorithm achieves better classification accuracy on all datasets, with accuracy generally above 75%. On the WINE dataset, it reaches 96% classification accuracy with the Naïve Bayes classifier. On the Ionosphere dataset, it gives almost 89% classification accuracy for most classifiers with a minimal selected feature subset. Improved classification accuracy is thus obtained with a selected subset containing a minimum number of features, at minimum processing time.
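The abstract does not give the EFAIM formula, so the following is only a rough sketch of the general pipeline it describes: build fuzzy sets for each feature, then score features against the class labels. Everything here is an assumption made for illustration: the equal-width placement of fuzzy set centers, the triangular membership functions, the membership-weighted class entropy used as a stand-in score, and the hypothetical function names. It does not restrict scoring to boundary samples as the paper does.

```python
import math
from collections import defaultdict


def equal_width_centers(values, n_sets=3):
    """Place n_sets (>= 2) fuzzy set centers evenly over the feature range.

    Assumption: a simple stand-in for the paper's discretization step.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return [lo]  # constant feature: a single fuzzy set
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def triangular_memberships(value, centers):
    """Membership of value in triangular fuzzy sets peaked at sorted centers.

    Memberships of one value always sum to 1.
    """
    mu = [0.0] * len(centers)
    if value <= centers[0]:
        mu[0] = 1.0
        return mu
    if value >= centers[-1]:
        mu[-1] = 1.0
        return mu
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= value <= hi:
            t = (value - lo) / (hi - lo)
            mu[i], mu[i + 1] = 1.0 - t, t
            break
    return mu


def fuzzy_class_entropy(values, labels, n_sets=3):
    """Membership-weighted class entropy of one feature (lower is better).

    Assumption: a generic fuzzy-entropy score, NOT the EFAIM measure.
    """
    centers = equal_width_centers(values, n_sets)
    mass = [defaultdict(float) for _ in centers]
    for v, y in zip(values, labels):
        for k, mu in enumerate(triangular_memberships(v, centers)):
            if mu > 0.0:
                mass[k][y] += mu
    score = 0.0
    for class_mass in mass:
        total = sum(class_mass.values())
        if total == 0.0:
            continue  # no sample belongs to this fuzzy set
        h = -sum((m / total) * math.log2(m / total) for m in class_mass.values())
        score += (total / len(values)) * h
    return score


def rank_features(rows, labels, n_sets=3):
    """Return feature indices sorted from most to least informative, plus scores."""
    columns = list(zip(*rows))
    scores = [fuzzy_class_entropy(list(col), labels, n_sets) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: scores[j])
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    rows = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    labels = ["a", "a", "b", "b"]
    print(rank_features(rows, labels))
```

On the toy data, the first feature separates the two classes perfectly and therefore ranks ahead of the second. A subset would then be formed by keeping the top-ranked features; how many to keep, and how boundary samples are identified, are left open here because the abstract does not specify them.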

Similar resources

Supervised Feature Subset Selection Based On Extended Fuzzy Relative Information Measure For Boundary Samples

Feature subset selection is an essential preprocessing task in data mining. This paper presents a new method called Extended Fuzzy Relative Information Measure for Boundary Samples (EFRIMBS) for supervised feature subset selection. The proposed algorithm uses boundary samples instead of the full set of samples. First, discretization algorithms such as K-Means, Fuzzy C Means and Median ...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has increased. Credit card fraud is a crucial problem in banking and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to enhance the accuracy of credit card fraud detection. After selecting the best and most effec...

Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...

Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain a suitable feature subset among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...

Evaluation of Feature Selection Methods for Object-Based Land Cover Mapping of Unmanned Aerial Vehicle Imagery Using Random Forest and Support Vector Machine Classifiers

The increased feature space available in object-based classification environments (e.g., extended spectral feature sets per object, shape properties, or textural features) has a high potential of improving classifications. However, the availability of a large number of derived features per segmented object can also lead to a time-consuming and subjective process of optimizing the feature subset...

Publication date: 2014