Generic Multi-Document Summarization Using Topic-Oriented Information

Authors

  • Yulong Pei
  • Wenpeng Yin
  • Lian'en Huang
Abstract

Graph-based ranking models have recently been widely used for multi-document summarization. By exploiting the correlations between sentences, these models extract salient sentences according to their ranking scores. However, traditional methods treat sentences uniformly and do not consider topic-level information. This paper proposes the topic-oriented PageRank (ToPageRank) model, in which topic information is fully incorporated, and designs the topic-oriented HITS (ToHITS) model to compare the influence of different graph-based algorithms. We evaluate the models on the DUC2004 data set. Experimental results demonstrate the effectiveness of ToPageRank and show that it is more effective and robust than other models, including ToHITS, under different evaluation metrics.
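The generic idea of graph-based sentence ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' ToPageRank or ToHITS model, and it uses no topic information: it is plain PageRank over a sentence-similarity graph. The bag-of-words cosine similarity, the damping factor of 0.85, the fixed iteration count, and all function names here are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with plain PageRank over a similarity graph.

    Assumed setup (not taken from the paper): edges are weighted by
    bag-of-words cosine similarity, and every sentence is treated alike.
    """
    n = len(sentences)
    bags = [Counter(tokenize(s)) for s in sentences]
    # Pairwise similarity as edge weight; no self-loops.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                if out_sums[j] > 0:
                    incoming += scores[j] * weights[j][i] / out_sums[j]
                else:
                    # Sentence j is similar to nothing: spread its score uniformly.
                    incoming += scores[j] / n
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat.",
        "The cat lay on the mat.",
        "Stock prices fell sharply today.",
    ]
    # Print sentences from highest to lowest score.
    for score, text in sorted(zip(rank_sentences(docs), docs), reverse=True):
        print(round(score, 3), text)
```

An extractive summarizer built on such scores would pick the top-ranked sentences up to a length limit. A topic-oriented variant would additionally bias the walk with topic-level information, which this sketch deliberately leaves out.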

Similar resources

NEO-CORTEX: A Performant User-Oriented Multi-Document Summarization System

This paper discusses an approach to topic-oriented multidocument summarization. It investigates the effectiveness of using additional information about the document set as a whole, as well as individual documents. We present NEO-CORTEX, a multi-document summarization system based on the existing CORTEX system. Results are reported for experiments with a document base formed by the NIST DUC-2005...

Multi-topic Based Query-Oriented Summarization

Query-oriented summarization aims at extracting an informative summary from a document collection for a given query. It is very useful to help users grasp the main information related to a query. Existing work can be mainly classified into two categories: supervised method and unsupervised method. The former requires training examples, which makes the method limited to predefined domains. While...

Analysis of Multi-Document Viewpoint Summarization Using Multi-Dimensional Genres

An interactive information retrieval system that provides different types of summaries of retrieved documents according to each user’s information needs can be effective for understanding the contents. The purpose of this study is to build a multi-document summarizer to produce summaries according to such viewpoints. As an exploratory stage of investigation, we examined the effectiveness of gen...

Multi-Document Arabic Summarization Using Text Clustering to Reduce Redundancy

The process of multi-document summarization is producing a single summary of a collection of related documents. In this work we focus on generic extractive Arabic multi-document summarizers. We also describe the cluster approach for multi-document summarization. The problem with multi-document text summarization is redundancy of sentences, and thus, redundancy must be eliminated to ensure cohe...

Affinity-Preserving Random Walk for Multi-Document Summarization

Multi-document summarization provides users with a short text that summarizes the information in a set of related documents. This paper introduces affinity-preserving random walk to the summarization task, which preserves the affinity relations of sentences by an absorbing random walk model. Meanwhile, we put forward adjustable affinity-preserving random walk to enforce the diversity constraint ...



Publication date: 2012