Search results for: inter rater reliability

Number of results: 255783

Journal: The Journal of Applied Psychology 2007
Filip Lievens, Juan I. Sanchez

A quasi-experiment was conducted to investigate the effects of frame-of-reference training on the quality of competency modeling ratings made by consultants. Human resources consultants from a large consulting firm were randomly assigned to either a training or a control condition. The discriminant validity, interrater reliability, and accuracy of the competency ratings were significantly highe...

2007
Febe de Wet, Christa van der Walt, Thomas Niesler

We describe first results obtained during the development of an automatic system for the assessment of spoken English proficiency of university students. The ultimate aim of this system is to allow fast, consistent and objective assessment of oral proficiency for the purpose of placing students in courses appropriate to their language skills. Rate of speech (ROS) was chosen as an indicator of f...

2009
Lee M. Christensen, Henk Harkema, Peter J. Haug, Jeannie Yuhaniak Irwin, Wendy W. Chapman

This paper introduces ONYX, a sentence-level text analyzer that implements a number of innovative ideas in syntactic and semantic analysis. ONYX is being developed as part of a project that seeks to translate spoken dental examinations directly into chartable findings. ONYX integrates syntax and semantics to a high degree. It interprets sentences using a combination of probabilistic classifiers,...

2006
Sanaz Jabbari, Ben Allison, David Guthrie, Louise Guthrie

This paper describes the largest scale annotation project involving the Enron email corpus to date. Over 12,500 emails were classified, by humans, into the categories “Business” and “Personal”, and then subcategorised by type within these categories. The paper quantifies how well humans perform on this task (evaluated by inter-annotator agreement). It presents the problems experienced with the ...
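The visible part of the abstract does not say which agreement statistic was used. As a minimal sketch, assuming two annotators and the two top-level categories named above, Cohen's kappa is one common choice: it corrects the raw agreement rate for the agreement expected by chance. The function name `cohen_kappa` and the label lists are hypothetical illustration data, not values from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for eight emails (not data from the paper).
annotator_1 = ["Business", "Business", "Personal", "Business",
               "Personal", "Personal", "Business", "Business"]
annotator_2 = ["Business", "Personal", "Personal", "Business",
               "Personal", "Business", "Business", "Business"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.467
```

Here the annotators agree on 6 of 8 items (0.75), chance agreement is 34/64, and kappa comes out at 7/15, noticeably lower than the raw agreement rate.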

2009
Luis M. T. Jesus, Anna Barney, Ricardo Santos, Janine Caetano, Juliana Jorge, Pedro Sá-Couto

This paper presents Universidade de Aveiro’s Voice Evaluation Protocol for European Portuguese (EP), and a preliminary inter-rater reliability study. Ten patients with vocal pathology were assessed by two Speech and Language Therapists (SLTs). Protocol parameters such as overall severity, roughness, breathiness, change of loudness (CAPEV), grade, breathiness and strain (GRBAS), glottal attack,...

2017
Jesse Dunietz, Lori S. Levin, Jaime G. Carbonell

Language of cause and effect captures an essential component of the semantics of a text. However, causal language is also intertwined with other semantic relations, such as temporal precedence and correlation. This makes it difficult to determine when causation is the primary intended meaning. This paper presents BECauSE 2.0, a new version of the BECauSE corpus with exhaustively annotated expre...

2016
Oded Avraham, Yoav Goldberg

We suggest a new method for creating and using gold-standard datasets for word similarity evaluation. Our goal is to improve the reliability of the evaluation, and we do this by redesigning the annotation task to achieve higher inter-rater agreement, and by defining a performance measure which takes the reliability of each annotation decision in the dataset into account.

Journal: Medical Teacher 2010
Salah Eldin Kassab, Shereen Hussain

BACKGROUND: In the problem-based learning (PBL) medical curriculum at the Arabian Gulf University in Bahrain, students construct concept maps related to each case they study in PBL tutorials. AIM: To evaluate the interrater reliability and predictive validity of concept map scores using a structured assessment tool. METHODS: We examined concept maps of the same cohort of students at the beginn...

2016
Steven Bethard, Jonathan Parker

We present a new annotation scheme for normalizing time expressions, such as three days ago, to computer-readable forms, such as 2016-03-07. The annotation scheme addresses several weaknesses of the existing TimeML standard, allowing the representation of time expressions that align to more than one calendar unit (e.g., the past three summers), that are defined relative to events (e.g., three w...
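To make the normalization target concrete, the abstract's own example can be sketched as simple date arithmetic. This is only an illustration of what "computer-readable form" means for one expression, not the paper's annotation scheme. The anchor date of 2016-03-10 is an assumption chosen so that "three days ago" yields the 2016-03-07 given in the abstract, and `normalize_days_ago` is a hypothetical helper.

```python
from datetime import date, timedelta


def normalize_days_ago(days, anchor):
    """Resolve an 'N days ago' expression to an ISO 8601 date string."""
    return (anchor - timedelta(days=days)).isoformat()


# Assumed anchor (document creation date); not stated in the abstract.
anchor = date(2016, 3, 10)

normalized = normalize_days_ago(3, anchor)
print(normalized)  # 2016-03-07
```

Expressions such as "the past three summers" do not reduce to a single date in this way, which is the kind of case the abstract says the new scheme is meant to represent.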

Journal: Assessment 2003
Richard Rogers, Rebecca L. Jackson, Kenneth W. Sewell, Chad E. Tillbrook, Mary A. Martin

Four decades of forensic research have left unanswered a fundamental issue regarding the best conceptualization of competency to stand trial vis-à-vis the Dusky standard. The current study investigated three competing models (discrete abilities, domains, and cognitive complexity) on combined data (N = 411) from six forensic and correctional samples. Using the Evaluation of Competency to Stand T...
