Multi-document summaries using n-gram graphs: salience and redundancy

Authors

  • George Giannakopoulos
  • George Vouros
  • Vangelis Karkaletsis
Abstract

This paper describes a summarization system that aims to provide a set of language-independent and generic methods for generating extractive summaries. The proposed methods are realized as operators on a generic character n-gram graph representation of texts, and are used for content selection and redundancy removal. This work defines the set of generic operators on n-gram graphs and proposes a number of ways of using these operators within the summarization process. Experiments on widely used corpora from the Document Understanding Conference and the Text Analysis Conference give promising results, providing evidence for the potential of the generic methods introduced.
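
To make the representation concrete, here is a minimal Python sketch of a character n-gram graph. The abstract does not give construction details, so everything below is an assumption made for illustration: the function names (`char_ngrams`, `ngram_graph`, `containment`), the n-gram length `n`, the neighbourhood size `window`, the use of co-occurrence counts as edge weights, and the overlap measure. It is not the authors' implementation.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_graph(text, n=3, window=3):
    """Build an undirected n-gram graph as a Counter of weighted edges.

    Assumption: two n-grams are linked when one starts within `window`
    positions after the other; the weight counts how often that happens.
    """
    grams = char_ngrams(text, n)
    edges = Counter()
    for i, gram in enumerate(grams):
        for other in grams[i + 1:i + 1 + window]:
            # Sort the pair so (a, b) and (b, a) are the same edge.
            edges[tuple(sorted((gram, other)))] += 1
    return edges


def containment(g1, g2):
    """Fraction of g1's edges that also occur in g2 (hypothetical measure)."""
    if not g1:
        return 0.0
    return sum(1 for edge in g1 if edge in g2) / len(g1)
```

Under these assumptions, a candidate sentence whose graph overlaps strongly with the graph of the source documents could be treated as salient, while one that overlaps strongly with the graph of already selected sentences could be treated as redundant.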

Similar resources

MUDOS-NG: Multi-document Summaries Using N-gram Graphs (Tech Report)

This report describes the MUDOS-NG summarization system, which applies a set of language-independent and generic methods for generating extractive summaries. The proposed methods are mostly combinations of simple operators on a generic character n-gram graph representation of texts. This work defines the set of used operators upon n-gram graphs and proposes using these operators within the mult...

Automatic Summarization from Multiple Documents

This work reports on research conducted on the domain of multi-document summarization using background knowledge. The research focuses on summary evaluation and the implementation of a set of generic use tools for NLP tasks and especially for automatic summarization. Within this work we formalize the n-gram graph representation and its use in NLP tasks. We present the use of n-gram graphs for t...

Cascaded Attention based Unsupervised Information Distillation for Compressive Summarization

When people recall and digest what they have read for writing summaries, the important content is more likely to attract their attention. Inspired by this observation, we propose a cascaded attention based unsupervised model to estimate the salience information from the text for compressive multi-document summarization. The attention weights are learned automatically by an unsupervised data rec...

Entity type modeling for multi-document summarization: generating descriptive summaries of geo-located entities

In this work we investigate the application of entity type models in extractive multi-document summarization using the automatic caption generation for images of geo-located entities (e.g. Westminster Abbey, Loch Ness, Eiffel Tower) as an application scenario. Entity type models contain sets of patterns aiming to capture the ways the geo-located entities are described in natural language. They ...

Multi-Document Summarization Model Based on Integer Linear Programming

This paper proposes an extractive generic text summarization model that generates summaries by selecting sentences according to their scores. Sentence scores are calculated using their extensive coverage of the main content of the text, and summaries are created by extracting the highest scored sentences from the original document. The model is formalized as a multiobjective integer programming pr...

Publication date: 2009