corpora creation

Search results for: corpora creation

Number of results: 147847

The CLaRK System: XML-based Corpora Development System for Rapid Prototyping

2004

Kiril Ivanov Simov Alexander Simov Hristo Ganev Krassimira Ivanova Ilko Grigorov

The paper presents the CLaRK System as a tool for the creation of XML-based corpora and a platform for rapid prototyping. The system provides a set of basic tools for processing XML documents. These tools include: tokenizers, regular grammars, constraints; remove, insert, extract, sort, transformation operations. Additionally, the system is equipped with a macro language which allows the creati...

Building a Brazilian Portuguese Parallel Corpus of Original and Simplified Texts

2009

Helena M. Caseli Tiago F. Pereira Lucia Specia Thiago A. S. Pardo Caroline Gasperin Sandra M. Aluisio

In this paper we address the problem of building the necessary tools and resources for performing Brazilian Portuguese text simplification. We describe our efforts on the design and development of: (a) an XCES-based annotation schema, (b) an annotation edition tool, and (c) a portal to access parallel corpora of original-simplified texts. These contributions were intended to (i) allow the creati...

Turbinated Corpora Cavernosa

Journal: The Boston Medical and Surgical Journal 1875

Spoken to Spoken vs. Spoken to Written: Corpus Approach to Exploring Interpreting and Subtitling

Journal: Polibits 2010

Mikhail Mikhailov Hannu Tommola Nina Isolahti

This issue of Polibits includes a selection of papers related to the topic of processing of semantic information. Processing of semantic information involves the use of methods and technologies that help machines to understand the meaning of information. These methods automatically perform analysis, extraction, generation, interpretation, and annotation of information contained on the Web, corpus, nat...

Web Genre Benchmark Under Construction

Journal: JLCL 2009

Marina Santini Serge Sharoff

The project discussed in this article focuses on the creation of web genre benchmarks (a.k.a. web genre reference corpora or web genre test collections), i.e. newly conceived test collections against which it will be possible to judge the performance of future genre-enabled web applications. The creation of web genre benchmarks is of key importance for the next generation of web applications be...

Integrating a Verb Lexicon into a Syntactic Treebank Production

2003

Milena Slavcheva

The creation of linguistically interpreted corpora is a tedious task and the automation of the annotation process is indispensable. A fully automated annotation is hardly possible to achieve, since it requires very sophisticated and large knowledge bases which are themselves difficult to create. However, a "machine-aided approach" to the annotating process, using as many as available sources o...

CLaRK - an XML-based System for Corpora Development

2001

Kiril Simov Alexander Simov Milen Kouylekov Krassimira Ivanova

In this paper we describe the architecture and the intended applications of the CLaRK System. The development of the CLaRK System started under the Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLaRK). The main aim behind the design of the system is the minimization of human intervention during the creation of corpora. Creation of corpo...

Corpora na Tradução

Journal: Domínios de Lingu@gem 2019

Corpora and crises

Journal: English Today 2003

A critical look at software tools in corpus linguistics

2013

Laurence Anthony

Anthony, Laurence. 2013. A critical look at software tools in corpus linguistics. Linguistic Research 30(2), 141-161. Corpora are often referred to as the ‘tools’ of corpus linguistics. However, it is important to recognize that corpora are simply linguistic data and that specialized software tools are required to view and analyze them. The functionality offered by software tools largely dictat...
