Wrapper feature selection with partially labeled data

Abstract

In this paper, we propose a new feature selection approach with partially labeled training examples in the multi-class classification setting. It is based on a modification of the genetic algorithm that creates and evaluates candidate feature subsets during an evolutionary process, taking into account feature weights and recursively eliminating irrelevant features. To increase the variety of the data, unlabeled observations are also employed, namely by pseudo-labeling them using self-learning with a recently proposed transductive policy. Empirical results on different data sets show the effectiveness of our method compared to several state-of-the-art semi-supervised approaches.
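The two ingredients named in the abstract, pseudo-labeling of unlabeled data and a genetic search over feature subsets scored by a classifier, can be illustrated with a minimal sketch. This is not the authors' algorithm: the nearest-centroid classifier, the margin-quantile rule used to accept pseudo-labels, the elitist selection scheme, the synthetic data, and every function and parameter name below are assumptions made for illustration. The feature weighting and recursive elimination described in the abstract are not reproduced.

```python
import numpy as np

def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def centroid_predict(model, X):
    """Return predicted labels and a margin (gap between the two closest centroids)."""
    classes, centroids = model
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    ordered = np.sort(d, axis=1)
    return classes[d.argmin(axis=1)], ordered[:, 1] - ordered[:, 0]

def pseudo_label(X_lab, y_lab, X_unl, quantile=0.5):
    """Self-learning step (assumed rule): keep unlabeled points with a large margin."""
    y_hat, margin = centroid_predict(centroid_fit(X_lab, y_lab), X_unl)
    keep = margin >= np.quantile(margin, quantile)
    return np.vstack([X_lab, X_unl[keep]]), np.concatenate([y_lab, y_hat[keep]])

def fitness(mask, X_tr, y_tr, X_val, y_val):
    """Wrapper criterion: validation accuracy using only the selected features."""
    if not mask.any():
        return 0.0
    model = centroid_fit(X_tr[:, mask], y_tr)
    y_hat, _ = centroid_predict(model, X_val[:, mask])
    return float((y_hat == y_val).mean())

def ga_select(X_tr, y_tr, X_val, y_val, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Plain elitist genetic algorithm over boolean feature masks."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, X_tr.shape[1])) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(m, X_tr, y_tr, X_val, y_val) for m in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # keep the best half
        n_child = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_child)]
        pb = elite[rng.integers(len(elite), size=n_child)]
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)  # uniform crossover
        children ^= (rng.random(children.shape) < p_mut)         # bit-flip mutation
        pop = np.vstack([elite, children])
    scores = np.array([fitness(m, X_tr, y_tr, X_val, y_val) for m in pop])
    return pop[scores.argmax()], float(scores.max())

# Synthetic 3-class data (hypothetical): features 0 and 1 are informative, the rest are noise.
data_rng = np.random.default_rng(42)
y_all = np.arange(300) % 3
X_all = data_rng.normal(size=(300, 10))
X_all[:, 0] += 3.0 * y_all
X_all[:, 1] -= 3.0 * y_all

X_lab, y_lab = X_all[:30], y_all[:30]      # small labeled set
X_val, y_val = X_all[30:60], y_all[30:60]  # labeled validation set
X_unl = X_all[60:]                         # labels treated as unknown

X_aug, y_aug = pseudo_label(X_lab, y_lab, X_unl)
best_mask, best_score = ga_select(X_aug, y_aug, X_val, y_val)
print("selected features:", np.flatnonzero(best_mask), "validation accuracy:", best_score)
```

Because the best half of each generation is carried over unchanged and the fitness is deterministic, the best validation accuracy in this sketch never decreases from one generation to the next. A real semi-supervised setting would also need to guard against pseudo-label noise, which the fixed quantile rule here handles only crudely.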

Similar resources

A Wrapper Feature Selection Approach to Classification with Missing Data

Many industrial and real-world datasets suffer from an unavoidable problem of missing values. The problem of missing data has been addressed extensively in the statistical analysis literature, and also, but to a lesser extent in the classification literature. The ability to deal with missing data is an essential requirement for classification because inadequate treatment of missing data may lea...

Wrapper Feature Selection

It is well known that the performance of most data mining algorithms can be deteriorated by features that do not add any value to learning tasks. Feature selection can be used to limit the effects of such features by seeking only the relevant subset from the original features (de Souza et al., 2006). This subset of the relevant features is discovered by removing those that are cons...

Wrapper for Ranking Feature Selection

We propose a new feature selection criterion not based on calculated measures between attributes, or complex and costly distance calculations. Applying a wrapper to the output of a new attribute ranking method, we obtain a minimum subset with the same error rate as the original data. The experiments were compared to two other algorithms with the same results, but with a very short computation t...

Optimizing Wrapper-Based Feature Selection for Use on Bioinformatics Data

High dimensionality (having a large number of independent attributes) is a major problem for bioinformatics datasets such as gene microarray datasets. Feature selection algorithms are necessary to remove the irrelevant (not useful) and redundant (contain duplicate information) features. One approach to handle this problem is wrapper-based subset evaluation, which builds classification models on...

Parallel GA-Based Wrapper Feature Selection for Spectroscopic Data Mining

Mining predictive models in dense databases is CPU time consuming and I/O intensive. In this paper, we propose a taxonomy of existing techniques for achieving high performance. We propose a hybrid approach that exploits four of them: feature selection, GA-based exploration space reduction, parallelism and concurrency. The approach is experimented on a near-infrared spectroscopic...

Journal

Journal title: Applied Intelligence

Year: 2022

ISSN: 0924-669X, 1573-7497

DOI: https://doi.org/10.1007/s10489-021-03076-w