Imbalanced Multiple Noisy Labeling for Supervised Learning
Authors
Abstract
When objects are labeled through Internet-based outsourcing systems, the labelers may be biased because of a lack of expertise or dedication, or because of personal preference. This bias causes imbalanced multiple noisy labeling. To deal with the imbalanced labeling issue, we propose an agnostic algorithm, PLAT (Positive LAbel frequency Threshold), which does not need any information about the quality of the labelers or the underlying class distribution. Simulations on eight real-world datasets with different underlying class distributions demonstrate that PLAT not only effectively deals with the imbalanced multiple noisy labeling problem that off-the-shelf agnostic methods cannot cope with, but also performs nearly the same as majority voting when labelers have no bias.
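To make the contrast with majority voting concrete, here is a minimal Python sketch of thresholding the positive label frequency of each example. It is an illustration only: the abstract does not describe how PLAT chooses its threshold, so the function names `positive_frequency` and `integrate_labels`, the toy votes, and the hand-picked threshold of 0.3 are all assumptions, not the authors' method.

```python
from typing import List, Sequence


def positive_frequency(labels: Sequence[int]) -> float:
    """Fraction of positive (1) votes among one example's noisy labels."""
    return sum(labels) / len(labels)


def integrate_labels(label_sets: Sequence[Sequence[int]],
                     threshold: float = 0.5) -> List[int]:
    """Assign 1 when the positive-vote frequency exceeds `threshold`, else 0.

    With threshold=0.5 this is plain majority voting (ties go to negative).
    Any other threshold is a hypothetical stand-in for a data-driven choice.
    """
    return [1 if positive_frequency(ls) > threshold else 0 for ls in label_sets]


# Toy votes from five labelers who are biased toward the negative class.
votes = [
    [1, 0, 0, 0, 1],  # positive frequency 0.4
    [0, 0, 0, 0, 0],  # positive frequency 0.0
    [1, 1, 0, 0, 0],  # positive frequency 0.4
    [0, 1, 0, 0, 0],  # positive frequency 0.2
]

majority = integrate_labels(votes)                # threshold 0.5
lowered = integrate_labels(votes, threshold=0.3)  # hand-picked, assumed value

print(majority)
print(lowered)
```

Under these assumed votes, majority voting labels every example negative, while the lowered threshold recovers the two examples whose positive frequency is 0.4. An agnostic method would have to estimate such a threshold from the label frequencies themselves rather than take it as an input.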
Similar resources
Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective
The success of current deep saliency detection methods heavily depends on the availability of large-scale supervision in the form of per-pixel labeling. Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models. By contrast, traditional handcrafted features based unsupervised saliency detection methods, even though have bee...
Improving Labeling Quality using Positive Label Frequency Threshold Algorithm
Label noise is a prominent issue in classification, with several potential negative consequences. For example, the prediction accuracy may drop, while the complexity of inferred models and the number of necessary training samples may rise. Online outsourcing systems, such as Amazon’s Mechanical Turk, allow multiple labelers to label the same objects, but the quality of their labels is still lacking. Mostly noisy labe...
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks
Many sequence learning tasks require the localization of certain events in sequences. Because it can be expensive to obtain strong labeling that specifies the starting and ending times of the events, modern systems are often trained with weak labeling without explicit timing information. Multiple instance learning (MIL) is a popular framework for learning from weak labeling. In a common scenari...
CorrActive Learning: Learning from Noisy Data through Human Interaction
We introduce a new framework of supervised machine learning called CorrActive Learning, short for Corrective Active Learning. Similar to active learning, this setting involves learning through human interaction. However, unlike active learning which aims to acquire labels for unlabeled examples, corrActive learning addresses the problem where the set of training data provided to the supervised ...
Deep learning from crowds
Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of...