A cluster-assumption based batch mode active learning technique
Abstract
In this paper, we propose an active learning technique for solving multiclass problems with support vector machine (SVM) classifiers. The technique combines an uncertainty criterion with a diversity criterion. The uncertainty criterion is implemented by analyzing the one-dimensional output space of the SVM classifier: a simple histogram thresholding algorithm locates the low-density region in the SVM output space in order to identify the most uncertain samples. The diversity criterion then applies the kernel k-means clustering algorithm to select uncorrelated informative samples from among the selected uncertain samples. To assess the effectiveness of the proposed method, we compared it with other batch mode active learning techniques from the literature on one toy data set and three real data sets. Experimental results confirmed that the proposed technique provides a very good tradeoff among robustness to biased initial training samples, classification accuracy, computational complexity, and the number of new labeled samples necessary to reach convergence.
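The two-step selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact thresholding rule, the kernel, or how a sample is picked from each cluster, so the choices below (binary scores instead of the multiclass setting, an RBF kernel, taking the sample closest to the threshold in each cluster, and all function and parameter names) are assumptions made for illustration only.

```python
import math
import random

def low_density_threshold(scores, bins=10):
    """Centre of the emptiest histogram bin between the two main peaks.
    Assumes the scores are not all identical."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in scores:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    half = bins // 2
    left_peak = max(range(half), key=counts.__getitem__)
    right_peak = max(range(half, bins), key=counts.__getitem__)
    mid = (left_peak + right_peak) / 2
    # Lowest count wins; ties go to the bin nearest the midpoint of the peaks.
    valley = min(range(left_peak, right_peak + 1),
                 key=lambda b: (counts[b], abs(b - mid)))
    return lo + (valley + 0.5) * width

def most_uncertain(scores, threshold, m):
    """Indices of the m samples closest to the threshold, nearest first."""
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i] - threshold))
    return order[:m]

def rbf(a, b, gamma=1.0):
    """RBF kernel (an assumed choice; the abstract does not name a kernel)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def kernel_kmeans(points, k, gamma=1.0, iters=20, seed=0):
    """Plain kernel k-means; returns one cluster label per point."""
    n = len(points)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    labels = [i % k for i in range(n)]
    random.Random(seed).shuffle(labels)
    for _ in range(iters):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (0 for empty ones).
        within = [sum(K[j][l] for j in c for l in c) / len(c) ** 2 if c else 0.0
                  for c in clusters]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # empty clusters cannot attract points
                # Squared feature-space distance from point i to the cluster mean.
                d = (K[i][i]
                     - 2 * sum(K[i][j] for j in members) / len(members)
                     + within[c])
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def select_batch(points, scores, m, k, bins=10, gamma=1.0):
    """Pick up to k samples: m uncertain candidates, then one per cluster."""
    threshold = low_density_threshold(scores, bins)
    uncertain = most_uncertain(scores, threshold, m)
    labels = kernel_kmeans([points[i] for i in uncertain], k, gamma)
    batch = {}
    for idx, lab in zip(uncertain, labels):
        # `uncertain` is sorted nearest-first, so the first index seen for a
        # cluster is that cluster's sample closest to the threshold.
        batch.setdefault(lab, idx)
    return sorted(batch.values())
```

Here `scores` stands in for the SVM decision values of the unlabeled samples and `points` for their feature vectors. `select_batch(points, scores, m=12, k=3)` would return at most three indices to send for labeling, and fewer if a cluster becomes empty during clustering.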
Similar resources
Batch-Mode Active Learning via Error Bound Minimization
Active learning has been proven to be quite effective in reducing the human labeling efforts by actively selecting the most informative examples to label. In this paper, we present a batch-mode active learning method based on logistic regression. Our key motivation is an out-of-sample bound on the estimation error of class distribution in logistic regression conditioned on any fixed training sa...
An Optimization Based Framework for Dynamic Batch Mode Active Learning
Active learning techniques have gained popularity in reducing human effort to annotate data instances for inducing a classifier. When faced with large quantities of unlabeled data, such algorithms automatically select the salient and representative samples for manual annotation. Batch mode active learning schemes have been recently proposed to select a batch of data instances simultaneously, ra...
RNN Based Batch Mode Active Learning Framework
Active learning has been applied in many real-world classification tasks to reduce the amount of labeled data required for training a classifier. However, most of the existing active learning strategies select only a single sample for labeling by the oracle in every iteration. This results in retraining the classifier after each sample is added, which is quite computationally expensive. Also many...
Active Instance Sampling via Matrix Partition
Recently, batch-mode active learning has attracted a lot of attention. In this paper, we propose a novel batch-mode active learning approach that selects a batch of queries in each iteration by maximizing a natural mutual information criterion between the labeled and unlabeled instances. By employing a Gaussian process framework, this mutual information based instance selection problem can be f...
A Batch Mode Active Learning for Networked Data
We study a novel problem of batch mode active learning for networked data. In this problem, data instances are connected with links and their labels are correlated with each other, and the goal of batch mode active learning is to exploit the link-based dependencies and node-specific content information to actively select a batch of instances to query the user for learning an accurate model to l...
Journal: Pattern Recognition Letters
Volume: 33
Publication date: 2012