Entity-driven Rewrite for Multi-document Summarization
Author
Ani Nenkova, University of Pennsylvania, Department of Computer and Information Science
Abstract
In this paper we explore the benefits and shortcomings of entity-driven noun phrase rewriting for multi-document summarization of news. The approach produces summaries whose content differs by 20% to 50% from an extractive summary produced using the same underlying approach, showing the promise of the technique. In addition, summaries produced using entity-driven rewrite have higher linguistic quality than those of a comparison non-extractive system. Some improvement over extractive summarization is also seen in content selection, as measured by pyramid method evaluation.
Disciplines: Computer Sciences
Comments: Nenkova, A., Entity-Driven Rewrite for Multi-Document Summarization, 3rd International Joint Conference on Natural Language Processing, 2008. This conference paper is available at ScholarlyCommons: http://repository.upenn.edu/cis_papers/730
Similar resources
Multiple Aspect Summarization Using Integer Linear Programming
Multi-document summarization involves many aspects of content selection and surface realization. The summaries must be informative, succinct, grammatical, and obey stylistic writing conventions. We present a method where such individual aspects are learned separately from data (without any hand-engineering) but optimized jointly using an integer linear programme. The ILP framework allows us to ...
Multi-Document Summarization Using Document Set Type Classification
In this paper, we propose a summarization system which automatically classifies the type of a document set and summarizes the set with the summarization mechanism appropriate to that type. The system classifies a document set into three types: (a) one-topic type, (b) multi-topic type, and (c) others. These types are identified using information from high-frequency nouns and named entities. In our multi...
Global and Local Models for Multi-Document Summarization
In this paper we study the effectiveness of combining corpus-level (global) tag-topic models and target document set level local models for multi-document summarization. Recently tag-topic models that exploit both word level annotation (e.g. named entity type) and/or document level metadata (e.g. words related to topic categories) have been proposed to model documents tagged from two different ...
Entity type modeling for multi-document summarization: generating descriptive summaries of geo-located entities
In this work we investigate the application of entity type models in extractive multi-document summarization using the automatic caption generation for images of geo-located entities (e.g. Westminster Abbey, Loch Ness, Eiffel Tower) as an application scenario. Entity type models contain sets of patterns aiming to capture the ways the geo-located entities are described in natural language. They ...
An Entity-Focused Approach to Generating Company Descriptions
Finding quality descriptions on the web, such as those found in Wikipedia articles, of newer companies can be difficult: search engines show many pages with varying relevance, while multi-document summarization algorithms find it difficult to distinguish between core facts and other information such as news stories. In this paper, we propose an entity-focused, hybrid generation approach to auto...