نتایج جستجو برای: text cohesion
تعداد نتایج: 177404 فیلتر نتایج به سال:
A method is presented for segmenting text into subtopic areas. The proportion of related pairwise words is calculated between adjacent windows of text to determine their lexical similarity. The lexical cohesion relations of reiteration and collocation are used to identify related words. These relations are automatically located using a combination of three linguistic features: word repetition, ...
In this paper we describe a method to obtain summaries focussed on chosen characters of a free text. Summaries are extracted from discourse structures, which resemble rhetorical trees. They are obtained by exploiting cohesion and coherence properties of the text. Evaluation intends to evidence the contribution of each module in the final result.
A probabilistic segment model combining lexical cohesion and disruption for topic segmentation Identifying topical structure in any text-like data is a challenging task. Most existing techniques rely either on maximizing a measure of the lexical cohesion or on detecting lexical disruptions. A novel method combining the two criteria so as to obtain the best trade-off between cohesion and disrupt...
ÐA robust system is proposed to automatically detect and extract text in images from different sources, including video, newspapers, advertisements, stock certificates, photographs, and checks. Text is first detected using multiscale texture segmentation and spatial cohesion constraints, then cleaned up and extracted using a histogram-based binarization algorithm. An automatic performance evalu...
The technology of automatic document summarization is maturing and may provide a solution to the information overload problem. Nowadays, document summarization plays an important role in information retrieval. With a large volume of documents, presenting the user with a summary of each document greatly facilitates the task of finding the desired documents. Document summarization is a process of...
This paper investigates the influence of discourse features on text complexity assessment. To do so, we created two data sets based on the Penn Discourse Treebank and the Simple English Wikipedia corpora and compared the influence of coherence, cohesion, surface, lexical and syntactic features to assess text complexity. Results show that with both data sets coherence features are more correlate...
cohesion is an indispensable linguistic feature in discourse analysis. lexicald such a differe cohesion and conjunction in particular as two crucial elements to textual cohesion and comprehension has been the focus of a wide range of studies up to now. yet the relationship between the open register and cohesive devices has not been thoroughly investigated in discourse studies. this study concen...
We have compiled a corpus of 80 Dutch texts from expository and persuasive genres, which we annotated for rhetorical and genre-specific discourse structure, and lexical cohesion with the goal of creating a gold standard for further research. The annotations are based on a segmentation of the text in elementary discourse units that takes into account cues from syntax and punctuation. During the ...
In this paper, we describe an n-gram approach to automatically assess essay quality in student writing. Underlying this approach is the development of n-gram indices that examine rhetorical, syntactic, grammatical, and cohesion features of paragraph types (introduction, body, and conclusion paragraphs) and entire essays. For this study, we developed over 300 n-gram indices and assessed their po...
Topic segmentation classically relies on one of two criteria, either finding areas with coherent vocabulary use or detecting discontinuities. In this paper, we propose a segmentation criterion combining both lexical cohesion and disruption, enabling a trade-off between the two. We provide the mathematical formulation of the criterion and an efficient graph based decoding algorithm for topic seg...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید