Integrating cohesion and coherence for Automatic Summarization
Abstract
This paper presents the integration of cohesive properties of text with coherence relations to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only a slight improvement in the resulting summaries and cannot beat a dummy baseline consisting of the first sentence of the document. Nevertheless, we argue that this approach relies on basic linguistic mechanisms and is therefore genre-independent.
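As a rough illustration of the idea in the abstract, the sketch below scores sentences by a cohesion signal and rescales the score by a coherence signal, then compares the result with the first-sentence baseline. It is not the paper's system: lexical chains are approximated here by plain word repetition across sentences (real lexical chains also link synonyms and related terms through a thesaurus), and the marker list, its weights, the stopword list, and all function names are hypothetical choices made for this example.

```python
import re
from collections import defaultdict

# Hypothetical discourse-marker weights (assumed for illustration, not from the paper):
# conclusive markers promote a sentence, exemplifying markers demote it.
MARKER_WEIGHTS = {"therefore": 1.5, "in conclusion": 1.5, "for example": 0.5}

# Tiny illustrative stopword list (assumption).
STOPWORDS = {"the", "a", "an", "and", "of", "at", "for", "in", "to", "is"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lowercased alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def build_chains(sentences):
    """Simplified 'chains': each word maps to the sentences it occurs in.

    Only words that occur in at least two sentences form a chain.
    """
    chains = defaultdict(set)
    for i, sentence in enumerate(sentences):
        for word in tokens(sentence):
            chains[word].add(i)
    return {w: idx for w, idx in chains.items() if len(idx) >= 2}


def marker_weight(sentence):
    """Multiply together the weights of all markers found in the sentence."""
    lowered = sentence.lower()
    weight = 1.0
    for marker, w in MARKER_WEIGHTS.items():
        if re.search(r"\b" + re.escape(marker) + r"\b", lowered):
            weight *= w
    return weight


def summarize(text, n=1):
    """Return the n best-scoring sentences in their original order."""
    sentences = split_sentences(text)
    chains = build_chains(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        # Cohesion score: total length of every chain passing through sentence i.
        cohesion = sum(len(idx) for idx in chains.values() if i in idx)
        scores.append(cohesion * marker_weight(sentence))
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:n]
    return [sentences[i] for i in sorted(top)]


def lead_baseline(text):
    """The 'dummy' baseline from the abstract: the first sentence of the document."""
    return split_sentences(text)[:1]
```

Calling `summarize(text, n=1)` and `lead_baseline(text)` on the same document gives the two candidate summaries to compare. Multiplying the cohesion score by the marker weight is only one possible way to combine the two signals; the abstract does not specify how the original system combines them.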
Similar resources
Cohesion and coherence for Automatic Summarization
This paper presents the integration of cohesive properties of text with coherence relations to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only a slight improvement in the resulting s...
Using Cohesion and Coherence Models for Text Summarization
In this paper we investigate two classes of techniques to determine what is salient in a text, as a means of deciding whether that information should be included in a summary. We introduce three methods based on text cohesion, which models text in terms of relations between words or referring expressions, to help determine how tightly connected the text is. We also describe a method based on te...
Generating Indicative-Informative Summaries with SumUM
Abstracts are texts used in tasks such as assessing the content of the document and deciding if the source is worth reading. If text summarization systems are designed to fulfil those requirements, the quality of the generated texts has to be evaluated according to their intended function. The quality of human-produced abstracts has been examined in the literature (Grant, 1992; Kaplan et al., 1994; Gib...
Lexical cohesion, discourse segmentation and document summarization
Summaries automatically derived by sentence extraction are known to exhibit some coherence degradation, readability deterioration, and topical under-representation. We propose a strategy for mitigating these problems, aiming to generate more cohesive summaries by analyzing the lexical cohesion factors in the source document texts. As an initial experiment, we have looked at one particular f...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...