Annotation Curricula to Implicitly Train Non-Expert Annotators
Authors
Abstract
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations; this is especially true in citizen science or crowdsourcing scenarios where domain expertise is not required. To alleviate these issues, this work proposes annotation curricula, a novel approach to implicitly train annotators. The goal is to gradually introduce annotators into the task by ordering the instances to be annotated according to a learning curriculum. To do so, we first formalize annotation curricula for sentence- and paragraph-level annotation tasks, define an ordering strategy, and identify well-performing heuristics and interactively trained models on three existing English datasets. Finally, we provide a proof of concept in a carefully designed user study with 40 voluntary participants who are asked to identify the most fitting misconception for tweets about the Covid-19 pandemic. Our results indicate that using a simple heuristic to order instances can already significantly reduce the total annotation time while preserving a high annotation quality. Annotation curricula thus constitute a promising research direction to improve data collection. To facilitate future research—for instance, to adapt annotation curricula to specific tasks and expert scenarios—all code and data from our user study, consisting of 2,400 annotations, are made available.
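The core idea above—ordering instances from easy to hard so that annotators are implicitly trained as they work—can be illustrated with a minimal sketch. The difficulty heuristic here (token count) is an assumption for illustration only; the paper evaluates its own heuristics and interactively trained models, which are not reproduced here.

```python
def order_by_curriculum(instances, difficulty=lambda text: len(text.split())):
    """Return annotation instances sorted from easiest to hardest.

    `difficulty` maps an instance to a numeric score (lower = easier).
    Presenting low-difficulty instances first eases annotators into the
    task before harder cases appear.
    """
    return sorted(instances, key=difficulty)


# Hypothetical example tweets (not from the paper's dataset).
tweets = [
    "5G towers spread the virus through radio waves.",
    "Masks cause CO2 poisoning.",
    "Vaccines alter DNA.",
]

# Shortest (here treated as easiest) tweets come first.
ordered = order_by_curriculum(tweets)
```

Any monotone difficulty estimate—readability scores, model uncertainty, or similarity to already-annotated items—could be plugged in as the `difficulty` callable.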
Similar Resources
Creating a linguistic plausibility dataset with non-expert annotators
We describe the creation of a linguistic plausibility dataset that contains annotated examples of language judged to be linguistically plausible, implausible, and everything in between. To create the dataset we randomly generate sentences and have them annotated by crowdsourcing via Amazon Mechanical Turk. Obtaining inter-annotator agreement is a difficult problem because linguistic plau...
Annotation: neurofeedback - train your brain to train behaviour.
BACKGROUND Neurofeedback (NF) is a form of behavioural training aimed at developing skills for self-regulation of brain activity. Within the past decade, several NF studies have been published that tend to overcome the methodological shortcomings of earlier studies. This annotation describes the methodical basis of NF and reviews the evidence base for its clinical efficacy and effectiveness in ...
Design and Evaluation of Shared Prosodic Annotation for Spontaneous French Speech: From Expert Knowledge to Non-Expert Annotation
In the area of large French speech corpora, there is a demonstrated need for a common prosodic notation system allowing for easy data exchange, comparison, and automatic annotation. The major questions are: (1) how to develop a single simple scheme of prosodic transcription which could form the basis of guidelines for non-expert manual annotation (NEMA), used for linguistic teaching and researc...
Word Sense Annotation of Polysemous Words by Multiple Annotators
We describe results of a word sense annotation task using WordNet, involving half a dozen well-trained annotators on ten polysemous words for three parts of speech. One hundred sentences for each word were annotated. Annotators had the same level of training and experience, but interannotator agreement (IA) varied across words. There was some effect of part of speech, with higher agreement on n...
Evaluating Dialogue Act Tagging with Naive and Expert Annotators
In this paper the dialogue act annotations of naive and expert annotators, both annotating the same data, are compared in order to characterise the insights that annotations made by different kinds of annotators may provide for evaluating dialogue act tagsets. It is argued that the agreement among naive annotators provides insight into the clarity of the tagset, whereas agreement among expert annotators...
Journal
Journal title: Computational Linguistics
Year: 2022
ISSN: 1530-9312, 0891-2017
DOI: https://doi.org/10.1162/coli_a_00436