Search results for: corpora creation
Number of results: 147847
The paper presents the CLaRK System as a tool for the creation of XML-based corpora and a platform for rapid prototyping. The system provides a set of basic tools for processing XML documents. These tools include: tokenizers, regular grammars, constraints; remove, insert, extract, sort, transformation operations. Additionally, the system is equipped with a macro language which allows the creati...
In this paper we address the problem of building the necessary tools and resources for performing Brazilian Portuguese text simplification. We describe our efforts on the design and development of: (a) an XCES-based annotation schema, (b) an annotation edition tool, and (c) a portal to access parallel corpora of original-simplified texts. These contributions were intended to (i) allow the creati...
This issue of Polibits includes a selection of papers related to the topic of processing of semantic information. Processing of semantic information involves usage of methods and technologies that help machines to understand the meaning of information. These methods automatically perform analysis, extraction, generation, interpretation, and annotation of information contained on the Web, corpus, nat...
The project discussed in this article focuses on the creation of web genre benchmarks (a.k.a. web genre reference corpora or web genre test collections), i.e. newly conceived test collections against which it will be possible to judge the performance of future genre-enabled web applications. The creation of web genre benchmarks is of key importance for the next generation of web applications be...
The creation of linguistically interpreted corpora is a tedious task and the automation of the annotation process is indispensable. A fully automated annotation is hardly possible to achieve, since it requires very sophisticated and large knowledge bases which are themselves difficult to create. However, a "machine-aided approach" to the annotating process, using as many as available sources o...
In this paper we describe the architecture and the intended applications of the CLaRK System. The development of the CLaRK System started under the Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLaRK). The main aim behind the design of the system is the minimization of human intervention during the creation of corpora. Creation of corpo...
Anthony, Laurence. 2013. A critical look at software tools in corpus linguistics. Linguistic Research 30(2), 141-161. Corpora are often referred to as the ‘tools’ of corpus linguistics. However, it is important to recognize that corpora are simply linguistic data and that specialized software tools are required to view and analyze them. The functionality offered by software tools largely dictat...