To build text summaries of high quality, nuclearity is not sufficient
نویسنده
چکیده
Researchers in discourse have long hypothesized that the nuclei of a rhetorical structure tree provide a good summary of the text for which that tree was built. In this paper, I discuss a psychohnguistic experiment that validates this hypothesis, but that also shows that the distinction between nuclei and satelhtes is not sufficient if we want to build summaries of very high quality. I empirically compare various techniques for mapping discourse trees into partial orders that reflect the importance of the elementary textual units in texts and I discuss both their strengths and weaknesses.
منابع مشابه
Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources
Coherent extracts are a novel type of summary combining the advantages of manually created abstractive summaries, which are fluent but difficult to evaluate, and low-quality automatically created extractive summaries, which lack coherence and structure. We use a corpus of heterogeneous documents to address the issue that information seekers usually face – a variety of different types of informa...
متن کاملSystematic literature review of fuzzy logic based text summarization
Information Overloadrq is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting userslq informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...
متن کاملEXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS
Due to the explosive growth of the world-wide web, automatictext summarization has become an essential tool for web users. In this paperwe present a novel approach for creating text summaries. Using fuzzy logicand word-net, our model extracts the most relevant sentences from an originaldocument. The approach utilizes fuzzy measures and inference on theextracted textual information from the docu...
متن کاملMix Multiple Features to Evaluate the Content and the Linguistic Quality of Text Summaries
In this article, we propose a method of text summary's content and linguistic quality evaluation that is based on a machine learning approach. This method operates by combining multiple features to build predictive models that evaluate the content and the linguistic quality of new summaries (unseen) constructed from the same source documents as the summaries used in the training and the validat...
متن کاملA survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
متن کامل