User-Focused Multi-Document Summarization with Paragraph Clustering and Sentence-Type Filtering

Authors

  • Yohei Seki
  • Koji Eguchi
  • Noriko Kando
Abstract

Applying document clustering techniques to multi-document summarization is a challenging problem, mostly because of the redundancy that exists across multiple sources. We compare several document clustering techniques for multi-document summarization on the NTCIR-4 TSC test collection. We conducted an experiment to evaluate how effectively each technique reduces redundancy in the summaries produced. From the results, we draw conclusions about the nature of multi-document summarization with respect to redundancy reduction strategies.
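
To make the idea of redundancy reduction through clustering concrete, here is a minimal sketch. It is not the authors' method: the greedy single-pass clustering, the whitespace tokenizer, the 0.5 similarity threshold, and the function names (bag_of_words, cosine, cluster_paragraphs, summarize) are all assumptions chosen for illustration. It groups near-duplicate paragraphs from several sources and keeps one representative per group.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; whitespace tokenization is a simplification."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_paragraphs(paragraphs, threshold=0.5):
    """Greedy single-pass clustering (an illustrative assumption).

    Each paragraph joins the first cluster whose seed paragraph is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of paragraph indices.
    """
    vectors = [bag_of_words(p) for p in paragraphs]
    clusters = []  # the first index in each cluster is its seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(paragraphs, threshold=0.5):
    """Keep only the seed paragraph of each cluster, dropping near-duplicates."""
    return [paragraphs[c[0]] for c in cluster_paragraphs(paragraphs, threshold)]
```

Calling summarize on paragraphs pooled from several documents returns one paragraph per cluster, so content repeated across sources appears only once. A real system would also rank clusters and choose representatives by relevance instead of taking the first paragraph seen.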

Similar resources

Latent semantic sentence clustering for multi-document summarization

This thesis investigates the applicability of Latent Semantic Analysis (LSA) to sentence clustering for Multi-Document Summarization (MDS). In contrast to more shallow approaches like measuring similarity of sentences by word overlap in a traditional vector space model, LSA takes word usage patterns into account. So far LSA has been successfully applied to different Information Retrieval (IR) t...

SIMBA: An Extractive Multi-document Summarization System for Portuguese

This is a proposal for a demonstration of SIMBA at PROPOR 2012. SIMBA is an extractive multi-document summarization system that aims to produce generic summaries guided by a compression rate defined by the user. It uses a double-clustering approach to find the relevant information in a set of texts. In addition, SIMBA uses a sentence simplification procedure as a means to ensure summary compress...

Concept Frequency: A Feature Set Based Text Compression Model

A summary is a shorter version of the original. Such a simplification highlights the major points of a much longer subject, such as a text, speech, film, or event. The purpose is to help the audience get the gist in a short period of time. Automatic summarization involves reducing a text document or a larger corpus of multiple documents into a short set of words or a paragraph that conveys th...

Sentence Clustering-based Summarization of Multiple Text Documents

With the rapid growth of the World Wide Web, information overload is becoming a problem for an increasingly large number of people. Automatic multi-document summarization can be an indispensable solution for reducing information overload on the web. This kind of summarization facility helps users to see at a glance what a collection is about and provides a new way of managing a vast hoa...

Graph-based models for multi-document summarization

Ercan Canhasi, University of Ljubljana, Faculty of Computer and Information Science. This thesis is about automatic document summarization, with experimental results on general, query, update and comparative multi-document summarization (MDS). We describe prior work and our own improvements on some important aspects of a summarization system, incl...

Publication date: 2004