Search results for: learner corpus

Number of results: 81699

2016
Elisa Corino Claudio Russo

Modern learner corpora are now routinely PoS tagged, whereas syntactic parsing is much less frequent. This paper proposes a first attempt at parsing applied to a subcorpus of VALICO, in an effort to identify key elements to be further used to parse corpora of Italian as a foreign language in

2012
Julian Brooke Graeme Hirst

The task of native language (L1) identification suffers from a relative paucity of useful training corpora, and standard within-corpus evaluation is often problematic due to topic bias. In this paper, we introduce a method for L1 identification in second language (L2) texts that relies only on much more plentiful L1 data, rather than the L2 texts that are traditionally used for training. In par...

2010
Barbora Štindlová

One of the challenges of contemporary corpus linguistics is the compilation and annotation of corpora consisting of texts produced by non-native speakers. In addition to morphosyntactic tagging and lemmatisation, such texts can be annotated with information relevant to the specific nonstandard use. Cases of deviant language use can be corrected and identified by a tag specifying the type of the e...

2016
Andrea Abel Aivars Glaznieks Lionel Nicolas Egon Stemle

This paper describes an extended version of the KoKo corpus (version KoKo4, Dec 2015), a corpus of written German L1 learner texts from three different German-speaking regions in three different countries. The KoKo corpus is richly annotated with learner language features on different linguistic levels such as errors or other linguistic characteristics that are not deficit-oriented, an...

2010
Heike Zinsmeister Margit Breckle

Learner corpora consist of texts produced by second language (L2) learners. We present ALeSKo, a learner corpus of Chinese L2 learners of German, and discuss the multi-layer annotation of the left sentence periphery, notably the Vorfeld.

2016
Amália Mendes Sandra Antunes Maarten Janssen Anabela Gonçalves

We present the COPLE2 corpus, a learner corpus of Portuguese that includes written and spoken texts produced by learners of Portuguese as a second or foreign language. The corpus includes at the moment a total of 182,474 tokens and 978 texts, classified according to the CEFR scales. The original handwritten productions are transcribed in TEI compliant XML format and keep record of all the origi...

Journal: International Journal of Learner Corpus Research 2016

2010
Adriane Boyd

This paper describes the Error-Annotated German Learner Corpus (EAGLE), a corpus of beginning learner German with grammatical error annotation. The corpus contains online workbook and hand-written essay data from learners in introductory German courses at The Ohio State University. We introduce an error typology developed for beginning learners of German that focuses on linguistic propertie...

Journal: IJCALLT 2014
Trude Heift Catherine Caws

This paper discusses a data-driven learning (DDL) tool, which consists of a learner corpus for L2 learners of German. The learner corpus, in addition to submissions from current users, has been constructed from millions of submissions from a variety of activity types of approximately 5000 learners who used the E-Tutor CALL system over a period of five years. By following a cyclical proc...
