Supplementary Material – Ambiguity Helps: Classification with Disagreements in Crowdsourced Annotations
Authors
Abstract
In this section we give all the details necessary to implement the EP algorithm [1] for the GPCconf method described in the main manuscript. We show how to compute the EP posterior approximation from the product of all the approximate factors, and how to implement the EP updates that refine each approximate factor. We also show how to compute the EP approximation of the marginal likelihood and its gradients. Recall from the main manuscript that in EP the approximate factors replace the corresponding exact factors of the likelihood in the joint distribution p(y|X, Xconf, f, g) p(f) p(g). The resulting approximate joint distribution is then normalized to obtain the EP posterior approximation, and the normalization constant is the EP approximation of the marginal likelihood. Finally, recall that in GPCconf the n-th likelihood factor to be approximated by EP is the factor defined in the main manuscript, which depends on both latent functions f and g.
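The exact GPCconf site updates are derived in the main manuscript and are not reproduced in this abstract. As a minimal self-contained sketch of the EP machinery described above (cavity computation, moment matching against an exact likelihood factor, site refinement, and renormalization of the posterior), the following Python code implements EP for standard binary GP classification with a probit likelihood, in the classical formulation of Rasmussen and Williams (Algorithm 3.5); the probit factor merely stands in for the GPCconf factor here, and the function name ep_gpc and all variable names are illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy.stats import norm

def ep_gpc(K, y, n_sweeps=20, tol=1e-6):
    """EP posterior q(f) = N(mu, Sigma) for labels y_i in {-1, +1} and GP prior N(0, K)."""
    n = len(y)
    nu, tau = np.zeros(n), np.zeros(n)    # site (natural) parameters of the approximate factors
    Sigma, mu = K.copy(), np.zeros(n)     # current Gaussian posterior approximation
    for _ in range(n_sweeps):
        tau_old = tau.copy()
        for i in range(n):
            # 1) Cavity: remove site i from the current posterior marginal.
            tau_cav = 1.0 / Sigma[i, i] - tau[i]
            nu_cav = mu[i] / Sigma[i, i] - nu[i]
            m_cav, v_cav = nu_cav / tau_cav, 1.0 / tau_cav
            # 2) Moment matching of (cavity x exact probit factor).
            z = y[i] * m_cav / np.sqrt(1.0 + v_cav)
            ratio = norm.pdf(z) / norm.cdf(z)   # N(z) / Phi(z)
            m_hat = m_cav + y[i] * v_cav * ratio / np.sqrt(1.0 + v_cav)
            v_hat = v_cav - v_cav**2 * ratio * (z + ratio) / (1.0 + v_cav)
            # 3) Site refinement and rank-one update of (mu, Sigma).
            d_tau = 1.0 / v_hat - tau_cav - tau[i]
            tau[i] += d_tau
            nu[i] = m_hat / v_hat - nu_cav
            s = Sigma[:, i].copy()
            Sigma -= (d_tau / (1.0 + d_tau * s[i])) * np.outer(s, s)
            mu = Sigma @ nu
        # Numerically stable recomputation of the posterior, i.e. the normalized
        # product of the GP prior and all approximate factors.
        s_half = np.sqrt(np.maximum(tau, 0.0))
        B = np.eye(n) + s_half[:, None] * K * s_half[None, :]
        L = np.linalg.cholesky(B)
        V = np.linalg.solve(L, s_half[:, None] * K)
        Sigma = K - V.T @ V
        mu = Sigma @ nu
        if np.max(np.abs(tau - tau_old)) < tol:
            break
    return mu, Sigma, nu, tau

Given a positive-definite kernel matrix K over the training inputs and a vector y of +/-1 labels, mu, Sigma, nu, tau = ep_gpc(K, y) returns the Gaussian posterior approximation; the EP approximation of the log marginal likelihood (and hence its gradients with respect to kernel hyperparameters) can be assembled from the same cavity and site quantities.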
Similar papers
Experiments with crowdsourced re-annotation of a POS tagging data set
Crowdsourcing lets us collect multiple annotations for an item from several annotators. Typically, these are annotations for non-sequential classification tasks. While there has been some work on crowdsourcing named entity annotations, researchers have largely assumed that syntactic tasks such as part-of-speech (POS) tagging cannot be crowdsourced. This paper shows that workers can actually ann...
Facilitating Reconciliation of Inter-Annotator Disagreements
Development and evaluation of Natural Language Processing methods often requires text annotation. To gauge the difficulty of the task and increase the reliability and quality of annotations, researchers often recruit at least two annotators. The discrepancies in annotations by multiple annotators need to be identified and reconciled. We present a tool that identifies and helps reconciling and v...
Effectively Crowdsourcing Radiology Report Annotations
Crowdsourcing platforms are a popular choice for researchers to gather text annotations quickly at scale. We investigate whether crowdsourced annotations are useful when the labeling task requires medical domain knowledge. Comparing a sentence classification model trained with expert-annotated sentences to the same model trained on crowd-labeled sentences, we find the crowdsourced training data...
Robust Online Gesture Recognition with Crowdsourced Annotations
Crowdsourcing is a promising way to reduce the effort of collecting annotations for training gesture recognition systems. Crowdsourced annotations suffer from "noise" such as mislabeling, or inaccurate identification of start and end time of gesture instances. In this paper we present SegmentedLCSS and WarpingLCSS, two template-matching methods offering robustness when trained with noisy crowds...
Separate or joint? Estimation of multiple labels from crowdsourced annotations
Artificial intelligence techniques aimed at more naturally simulating human comprehension fit the paradigm of multi-label classification. Generally, an enormous amount of high-quality multi-label data is needed to form a multi-label classifier. The creation of such datasets is usually expensive and time-consuming. A lower-cost way to obtain multi-label datasets for use with such comprehension–si...