Search results for: text correction

Number of results: 328701

Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space character leads to some s...
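The space-versus-half-space confusion described in this abstract can be illustrated with a minimal sketch. It assumes the half-space is encoded as the Unicode zero-width non-joiner (U+200C), and it uses a hypothetical two-entry prefix list chosen only for illustration; it is not the method of the paper, whose abstract is truncated here.

```python
import re

# Assumption: the Persian half-space is encoded as ZWNJ (U+200C).
ZWNJ = "\u200c"

# Hypothetical, deliberately tiny rule set: the verbal prefixes
# "mi" (U+0645 U+06CC) and "nemi" (U+0646 U+0645 U+06CC).
PREFIXES = ("\u0645\u06cc", "\u0646\u0645\u06cc")

# Match a prefix that starts a token and is followed by exactly one
# ordinary space and then more text.
_PATTERN = re.compile(r"(?<!\S)(" + "|".join(PREFIXES) + r") (?=\S)")


def fix_prefix_spacing(text: str) -> str:
    """Replace the space after a listed prefix with a half-space (ZWNJ)."""
    return _PATTERN.sub(lambda m: m.group(1) + ZWNJ, text)
```

A purely rule-based pass like this over-corrects when a listed prefix is also a standalone word, so a real corrector would need lexical or contextual evidence before joining the parts.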

Journal: :Language Resources and Evaluation 2015
John Lee Chak Yan Yeung Amir Zeldes Marc Reznicek Anke Lüdeling Jonathan Webster

Learner corpora consist of texts produced by non-native speakers. In addition to these texts, some learner corpora also contain error annotations, which can reveal common errors made by language learners, and provide training material for automatic error correction. We present a novel type of error-annotated learner corpus containing sequences of revised essay drafts written by non-native speak...

Journal: :Proceedings of the AAAI Conference on Artificial Intelligence 2020

Journal: :CoRR 2007
Mike Tian-Jian Jiang James Zhan Jaimie Lin Jerry Lin Wen-Lien Hsu

In this paper, we propose an automated evaluation metric for text entry. We also consider possible improvements to existing text entry evaluation metrics, such as the minimum string distance error rate, keystrokes per character, cost per correction, and a unified approach proposed by MacKenzie, so they can accommodate the special characteristics of Chinese text. Current methods lack an integrat...
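Of the existing metrics named in this abstract, the minimum string distance (MSD) error rate is the simplest to sketch. The version below assumes the common definition, the Levenshtein distance between the presented and transcribed strings divided by the length of the longer string, expressed as a percentage. The function names are illustrative, and none of the Chinese-specific adaptations the paper proposes are modeled.

```python
def msd(a: str, b: str) -> int:
    """Minimum string distance (Levenshtein distance) between a and b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def msd_error_rate(presented: str, transcribed: str) -> float:
    """MSD error rate in percent, normalized by the longer string."""
    longest = max(len(presented), len(transcribed))
    if longest == 0:
        return 0.0  # two empty strings: no errors by convention
    return 100.0 * msd(presented, transcribed) / longest
```

For example, `msd_error_rate("kitten", "sitting")` divides a distance of 3 by the longer length of 7.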

H. Ghasemzadeh H. Mehrara M. Tajik Khasss

Permutation multimedia ciphers were recently broken in a chosen-plaintext scenario. That attack assumes a very resourceful adversary, which may not always be the case. To show the insecurity of these ciphers, we present a ciphertext-only attack on speech permutation ciphers. We show that the inherent redundancies of speech can pave the way for a successful ciphertext-only attack. To that end, regularities ...

Journal: :CoRR 2006
Fabio G. Guerrero Lucio A. Perez

In this paper, we report tutorial software for learning the basics of Information Theory in a practical way. The software, called IT-tutor-UV, uses a modern existing Spanish corpus to model the source. Both source coding and channel coding are also included in this educational tool as part of the learning experience. Entropy values of the Spanish language obtained with the IT-tutor...

2012
Ross Israel Joel R. Tetreault Martin Chodorow

While the field of grammatical error detection has progressed over the past few years, one area of particular difficulty for both native and non-native learners of English, comma placement, has been largely ignored. We present a system for comma error correction in English that achieves an average of 89% precision and 25% recall on two corpora of unedited student essays. This system also achiev...

2011
Michal Marcinczuk Monika Zasko-Zielinska Maciej Piasecki

The Polish Corpus of Suicide Notes (henceforth PCSN) is constructed to meet the needs of forensic linguistics. Suicide notes are messages created in a borderline situation, shortly before death. Hence the annotation schema requires a complex description of the document structure, the textual content, and its linguistic properties. TEI was selected as the basis for the document encoding schema. T...

2014
M. Pilar Valverde Ibañez Akira Ohtani

Annotating a corpus with error information is a challenging task. This paper describes the design, evaluation, and refinement of an annotation scheme for Spanish article errors in learner data, so that future work on corpus annotation and automatic article error detection can progress. To evaluate reliability, 300 noun phrases with definite, indefinite, and zero articles have been tagged by four a...
