A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
Authors
Abstract:
High-dimensional microarray datasets are difficult to classify because they have many features, a small number of instances, and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method that improves the classification performance on microarray datasets by selecting the significant features. The main contribution of this paper is an effective algorithm that combines the concepts of rough sets, weighted rough sets, fuzzy rough sets and hesitant fuzzy sets. The method has two steps. In the first step, four discretization approaches are applied to discretize the continuous datasets, and a primary subset of features is selected by combining the weighted rough set dependency degree and information gain via a hesitant fuzzy aggregation approach. In the second step, a significance measure of features (defined by fuzzy rough concepts) is employed to remove redundant features from the primary set. The Wilcoxon signed-rank test (a non-parametric statistical test) is conducted to compare the presented method with ten feature selection methods across seven datasets. The experimental results show that the proposed method is able to select a significant subset of features and that it is effective, relative to methods in the literature, in terms of classification performance and simplicity.
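As a rough illustration of the first step, the sketch below scores each discretized feature by the classical rough set dependency degree and by information gain, then ranks features by a combined score. It is a simplified sketch under stated assumptions, not the paper's algorithm: it uses the unweighted dependency degree rather than the weighted rough set variant, and a plain arithmetic mean stands in for the hesitant fuzzy aggregation, whose details the abstract does not give. The function names (`dependency_degree`, `information_gain`, `rank_features`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def dependency_degree(rows, feature_idx, labels):
    """Classical rough set dependency: share of samples in the positive region.

    A sample is in the positive region when every sample that has the same
    values on the chosen features also has the same class label.
    """
    classes_per_block = defaultdict(set)
    size_per_block = Counter()
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        classes_per_block[key].add(y)
        size_per_block[key] += 1
    positive = sum(size_per_block[k]
                   for k, ys in classes_per_block.items() if len(ys) == 1)
    return positive / len(rows)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, j, labels):
    """Reduction in label entropy after splitting on discretized feature j."""
    groups = defaultdict(list)
    for row, y in zip(rows, labels):
        groups[row[j]].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Rank feature indices by a combined score, best first.

    ASSUMPTION: the mean of the two scores is a placeholder for the
    hesitant fuzzy aggregation named in the abstract.
    """
    h = entropy(labels)
    scored = []
    for j in range(len(rows[0])):
        dep = dependency_degree(rows, [j], labels)
        # Normalize information gain to [0, 1] so both scores share a scale.
        ig = information_gain(rows, j, labels) / h if h > 0 else 0.0
        scored.append((0.5 * (dep + ig), j))
    return [j for _, j in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Toy discretized data: feature 0 determines the class, feature 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
ranking = rank_features(rows, labels)
```

On this toy data the informative feature is ranked ahead of the noisy one. A primary subset could then be taken as the top-ranked features, with the cut-off left as a design choice that the abstract does not specify.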
Similar resources
Hesitant fuzzy rough sets through hesitant fuzzy relations
Introducing rough sets in hesitant fuzzy set domain and using it for the various applications would open up new possibilities in rough set theory. For this purpose the notion of hesitant fuzzy relations is introduced. The foundation of equivalence hesitant fuzzy relation is laid. Definition of anti-reflexive kernel, symmetric kernel etc. is proposed and the formulae to evaluate them are derived...
Feature subset selection based on fuzzy neighborhood rough sets
Rough set theory has been extensively discussed in machine learning and pattern recognition. It provides us another important theoretical tool for feature selection. In this paper, we construct a novel rough set model for feature subset selection. First, we define the fuzzy decision of a sample by using the concept of fuzzy neighborhood. A parameterized fuzzy relation is introduced to character...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
Combining rough and fuzzy sets for feature selection
Feature selection (FS) refers to the problem of selecting those input attributes that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition and signal processing. Unlike other dimensionality reduction methods, feature selectors preserve the original meaning of the features after reduction. This has found application in tasks th...
Soft fuzzy rough sets for robust feature evaluation and selection
The fuzzy dependency function proposed in the fuzzy rough set model is widely employed in feature evaluation and attribute reduction. It is shown that this function is not robust to noisy information in this paper. As datasets in real-world applications are usually contaminated by noise, robustness of data analysis models is very important in practice. In this work, we develop a new model of fu...
On fuzzy-rough sets approach to feature selection
In this paper, we have shown that the fuzzy-rough set attribute reduction algorithm [Jenson, R., Shen, Q., 2002. Fuzzy-rough sets for descriptive dimensionality reduction. In: Proceedings of IEEE International Conference on Fuzzy Systems, FUZZ-IEEE'02, May 12-17, pp. 29-34] is not convergent on many real datasets due to its poorly designed termination criteria; and the computational complexity ...
Journal title
Volume 16, Issue 2
Pages 165-182
Publication date: 2019-03-01