Search results for: corpora creation

Number of results: 147847

2004
Kiril Ivanov Simov Alexander Simov Hristo Ganev Krassimira Ivanova Ilko Grigorov

The paper presents the CLaRK System as a tool for the creation of XML-based corpora and a platform for rapid prototyping. The system provides a set of basic tools for processing XML documents. These tools include tokenizers, regular grammars, and constraints, as well as remove, insert, extract, sort, and transformation operations. Additionally, the system is equipped with a macro language which allows the creati...

2009
Helena M. Caseli Tiago F. Pereira Lucia Specia Thiago A. S. Pardo Caroline Gasperin Sandra M. Aluisio

In this paper we address the problem of building the necessary tools and resources for performing Brazilian Portuguese text simplification. We describe our efforts on the design and development of: (a) an XCES-based annotation schema, (b) an annotation editing tool, and (c) a portal to access parallel corpora of original-simplified texts. These contributions were intended to (i) allow the creati...

Journal: The Boston Medical and Surgical Journal 1875

Journal: Polibits 2010
Mikhail Mikhailov Hannu Tommola Nina Isolahti

This issue of Polibits includes a selection of papers related to the topic of processing of semantic information. Processing of semantic information involves the use of methods and technologies that help machines to understand the meaning of information. These methods automatically perform analysis, extraction, generation, interpretation, and annotation of information contained on the Web, corpus, nat...

Journal: JLCL 2009
Marina Santini Serge Sharoff

The project discussed in this article focuses on the creation of web genre benchmarks (a.k.a. web genre reference corpora or web genre test collections), i.e. newly conceived test collections against which it will be possible to judge the performance of future genre-enabled web applications. The creation of web genre benchmarks is of key importance for the next generation of web applications be...

2003
Milena Slavcheva

The creation of linguistically interpreted corpora is a tedious task and the automation of the annotation process is indispensable. A fully automated annotation is hardly possible to achieve, since it requires very sophisticated and large knowledge bases which are themselves difficult to create. However, a "machine-aided approach" to the annotating process, using as many as available sources o...

2001
Kiril Simov Alexander Simov Milen Kouylekov Krassimira Ivanova

In this paper we describe the architecture and the intended applications of the CLaRK System. The development of the CLaRK System started under the Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLaRK). The main aim behind the design of the system is the minimization of human intervention during the creation of corpora. Creation of corpo...

Journal: Domínios de Lingu@gem 2019

Journal: English Today 2003

2013
Laurence Anthony

Anthony, Laurence. 2013. A critical look at software tools in corpus linguistics. Linguistic Research 30(2), 141-161. Corpora are often referred to as the ‘tools’ of corpus linguistics. However, it is important to recognize that corpora are simply linguistic data and that specialized software tools are required to view and analyze them. The functionality offered by software tools largely dictat...
