Diversifying Convex Transductive Experimental Design for Active Learning

Authors

  • Lei Shi
  • Yi-Dong Shen
Abstract

Convex Transductive Experimental Design (CTED) is one of the most representative active learning methods. It uses a data reconstruction framework to select informative samples for manual annotation. However, we observe that CTED does not handle the diversity of the selected samples well, so the selected set may contain mutually similar samples that convey similar or overlapping information. This is undesirable: given a limited labeling budget, it is preferable to select informative samples that carry complementary information, i.e., to exclude similar samples. To this end, we propose Diversified CTED, which seamlessly incorporates a novel and effective diversity regularizer into CTED to ensure that the selected samples are diverse. The diversity regularizer makes the optimization problem hard to solve, so we derive an effective algorithm that solves an equivalent problem which is easier to optimize. Extensive experimental results on several benchmark data sets demonstrate that Diversified CTED significantly improves on CTED and consistently outperforms state-of-the-art methods, verifying the effectiveness and advantages of incorporating the proposed diversity regularizer into CTED.

Similar resources

Transductive Learning with Multi-class Volume Approximation

Given a hypothesis space, the large volume principle by Vladimir Vapnik prioritizes equivalence classes according to their volume in the hypothesis space. The volume approximation has hitherto been successfully applied to binary learning problems. In this paper, we propose a novel generalization to multiple classes, allowing applications of the large volume principle on more learning problems s...

Discriminative Experimental Design

Since labeling data is often both laborious and costly, the labeled data available in many applications is rather limited. Active learning is a learning approach which actively selects unlabeled data points to label as a way to alleviate the labeled data deficiency problem. In this paper, we extend a previous active learning method called transductive experimental design (TED) by proposing a ne...

Transductive Confidence Machine for Active Learning

This paper describes a novel active learning strategy using universal p-value measures of confidence based on algorithmic randomness, and transductive inference. The early stopping criterion for active learning is based on the bias-variance tradeoff for classification. This corresponds to that learning instance when the boundary bias becomes positive, and requires one to switch from active to ra...

Transductive Experiment Design

This paper considers the problem of selecting the most informative experiments x to get measures y for learning an inference model y = f(x). We propose a novel concept for active learning, transductive experiment design, to overcome the shortcomings of existing experiment design methods, e.g. insufficient exploration of available unmeasured data and poor scalability for large data sets. In-dept...

Repairing self-confident active-transductive learners using systematic exploration

We consider an active learning game within a transductive learning model. A major problem with many active learning algorithms is that an unreliable current hypothesis can mislead the querying component to query “uninformative” points. In this work we propose a remedy to this problem. Our solution can be viewed as a “patch” for fixing this deficiency and also as a proposed modular approach for ...

Publication date: 2016