Ensemble Based Co-Training

نویسندگان

Jafar Tanha

Maarten van Someren

Hamideh Afsarmanesh

چکیده

Recently Semi-Supervised learning algorithms such as co-training are used in many application domains. In co-training, two classifiers based on different views of data or on different learning algorithms are trained in parallel and then unlabeled data that are classified differently by the classifiers but for which one classifier has large confidence are labeled and used as training data for the other. In this paper, a new form of co-training, called Ensemble-Co-Training, is proposed that uses an ensemble of different learning algorithms. Based on a theorem by Angluin and Laird that produces an approximately correct identification with high probability for reliable examples, we propose a criterion for finding a subset of high-confidence predictions and error rate for a classifier in each iteration of the training process. Experiments show that the new method in almost all domains gives better results than the other methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Rough Set Method for Co-training Algorithm

In recent years, semi-supervised learning has been a hot research topic in machine learning area. Different from traditional supervised learning which learns only from labeled data; semi-supervised learning makes use of both labeled and unlabeled data for learning purpose. Co-training is a popular semi-supervised learning algorithm which assumes that each example is represented by two or more r...

متن کامل

Ensemble strategies to build neural network to facilitate decision making

There are three major strategies to form neural network ensembles. The simplest one is the Cross Validation strategy in which all members are trained with the same training data. Bagging and boosting strategies pro-duce perturbed sample from training data. This paper provides an ideal model based on two important factors: activation function and number of neurons in the hidden layer and based u...

متن کامل

The ensemble clustering with maximize diversity using evolutionary optimization algorithms

Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...

متن کامل

انتخاب خوشه‌های اولیه به کمک الگوریتم‌های هوشمند برای مشارکت در خوشه‌بندی ترکیبی

Most of the recent studies have tried to create diversity in primary results and then applied a consensus function over all the obtained results to combine the weak partitions. In this paper a clustering ensemble method is proposed which is based on a subset of primary clusters. The main idea behind this method is using more stable clusters in the ensemble. The stability is applied as a goodnes...

متن کامل