Search results for: paper based texts

Number of results: 4016250

2008
Clémentine Adam Estelle Delpech Patrick Saint-Dizier

In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedural texts. Identifying texts is a difficult task because ways of encoding them are very diverse. A number of titles are also incomplete because of context; we also propose a way to retrieve the missing elements, in par...

2014
Juyeon Kang Patrick Saint-Dizier

In this paper, we investigate and experiment with the notion of error correction memory applied to error correction in technical texts. The main purpose is to induce relatively generic correction patterns associated with more contextual correction recommendations, based on previously memorized and analyzed corrections. The notion of error correction memory is developed within the framework of the LE...

2013
Lakshmi Ramachandran Edward F. Gehringer

Review quality is determined by identifying the relevance of a review to a submission (the article or paper the review was written for). We identify relevance in terms of the semantic and syntactic similarities between two texts. We use a word order graph, whose vertices, edges and double edges help determine structure-based match across texts. We use WordNet to determine semantic relatedness. ...

2011
Nynke van der Vliet Ildikó Berzlánovich Gosse Bouma Markus Egg Gisela Redeker

We are compiling a corpus of Dutch texts annotated with discourse structure and lexical cohesion, containing initially 80 texts from expository and persuasive genres. We are using this resource for corpus-based studies of discourse relations, discourse markers, cohesion, and genre differences. We are also exploring the possibilities of automatic text segmentation and semi-automatic discourse an...

2013
Hossein Kardan Moghaddam

The horizontal segmentation of handwritten text lines is a key step in detecting slant in handwritten texts. In this paper, a novel method based on fuzzy triangles is proposed to bring together and connect the text lines. The proposed method has been tested on Chinese-language data banks. In the experiments on Chinese handwritten texts, a performance of 94.53% was obtained. Abbrev...

2012
Frank Serafini

Discussions concerning which literacy skills will be required of students in the 21st century have appeared in numerous educational publications recently and have been greeted with mixed reactions (Bellanca & Brandt, 2010; Trilling & Fadel, 2009). It has been proposed that the skills necessary to be a literate citizen in the new millennium have expanded from simply being able to read and write ...

Journal: محاسبات نرم (Soft Computing) 2013

With the increasing growth of scientific documents on the Web, it is difficult to select a relevant document. A citation recommendation system receives a text and recommends documents to be cited by the text. Such recommendations help a researcher find the texts he or she needs. Based on semantic relations, this paper presents a new indicator to measure the similarity between documents an...

2010
Yulia Tsvetkov Shuly Wintner

Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of constantly growing parallel corpora. We propose a simple and effective dictionary-based algorithm to extract parallel document pairs from a large collection of articles retrieved from the Internet, potentially containin...

2002
Sanda M. Harabagiu Steven J. Maiorano

This paper describes a methodology for answering questions by using information retrieved from very large collections of texts. We argue that combinations of information retrieval and extraction techniques cannot be used, due to the open-domain nature of the task. We propose a solution based on indexing techniques that identify paragraphs from texts where the answers can be found. The validity ...

2015
Bastian Entrup

This paper introduces an open-source Java-package called German Language Processing for Lucene (glp4lucene). Although it was originally developed to work with German texts, it is to a large degree language independent. It aims at facilitating four language processing steps for working with non-English texts and Apache Lucene/Solr: lemmatizing words, weighting terms based on their part-of-speech...
