Sentence Position revisited: A robust light-weight Update Summarization baseline Algorithm

Authors

  • Rahul Katragadda
  • Prasad Pingali
  • Vasudeva Varma
Abstract

In this paper, we describe a sentence-position-based summarizer built on a sentence position policy derived from the evaluation testbed of recent summarization tasks at the Document Understanding Conferences (DUC). We show that the resulting summarizer outperforms most systems participating in the task-focused summarization evaluations at the Text Analysis Conference (TAC) 2008. Our experiments also show that such a method performs better at producing short summaries (up to 100 words) than longer summaries. Further, we discuss the baselines traditionally used for summarization evaluation and suggest the revival of an old baseline to suit the current summarization task at TAC: the Update Summarization task.
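To make the idea of a position-based baseline concrete, here is a minimal sketch in Python. It is not the authors' sentence position policy, which the abstract says was derived from DUC data and is not specified here. It assumes a simple round-robin rule instead: take the first sentence of every document, then the second, and so on, stopping before the word limit is exceeded. The names `split_sentences` and `position_baseline`, and the naive punctuation-based splitter, are hypothetical and chosen only for illustration.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def position_baseline(documents: List[str], word_limit: int = 100) -> str:
    """Pick sentences by position: sentence 0 of each document, then sentence 1, etc.

    Stops as soon as adding the next sentence would exceed word_limit.
    This is an illustrative rule, not the policy described in the paper.
    """
    sentence_lists = [split_sentences(d) for d in documents]
    depth = max((len(s) for s in sentence_lists), default=0)
    summary: List[str] = []
    used = 0
    for position in range(depth):
        for sentences in sentence_lists:
            if position >= len(sentences):
                continue  # this document has no sentence at this position
            sentence = sentences[position]
            n_words = len(sentence.split())
            if used + n_words > word_limit:
                return " ".join(summary)
            summary.append(sentence)
            used += n_words
    return " ".join(summary)


if __name__ == "__main__":
    docs = ["A b c. D e f.", "G h. I j."]
    print(position_baseline(docs, word_limit=5))
```

The default limit of 100 words mirrors the short-summary setting mentioned in the abstract; a real system would also need proper sentence boundary detection and some handling of the update setting.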

Similar resources

Update Summarization

Update Summarization is a form of multi-document summarization wherein we generate a summary of a multi-document dataset based on the assumption that the user has already read a given set of documents. In our paper, we present a summarization system which clusters together sentences from the old set based on a semantic similarity score. We then use the centroids of these clusters, along with an...

Experimenting with Clause Segmentation for Text Summarization

In this paper, we describe our experiments with clause segmentation in producing summaries for the TAC 2008 Update Summarization Track. The submitted runs were designed to determine if a heuristic clause segmentation applied before sentence selection would improve summarization results by reducing the need for sentence compression approaches. A baseline summariser was used to test this hypothes...

A Persian Text Summarization System Based on Linguistic Features and Regression

Considering the vast amount of existing written information and the shortage of time, optimal summarization of books, articles, news reports, etc. on the Web is a major concern of researchers. In this paper, we propose a new approach for Persian single-document Summarization based on several linguistic features of text. In our approach after extracting the linguistic features for each sentence,...

The ICSI/UTD Summarization System at TAC 2009

We describe improvements to our 2008 system that result in a top-performing summarization system. The motivating ideas are (1) improve sentence boundary detection to avoid damaging errors in preprocessing; (2) prune sentences that are unlikely to work well in a summary; (3) leverage sentence position to improve update summarization; (4) focus on high-precision sentence compression to improve re...

Evaluation of a Sentence Ranker for Text Summarization Based on Roget's Thesaurus

Evaluation is one of the hardest tasks in automatic text summarization. It is perhaps even harder to determine how much a particular component of a summarization system contributes to the success of the whole system. We examine how to evaluate the sentence ranking component using a corpus which has been partially labelled with Summary Content Units. To demonstrate this technique, we apply it to...

Publication date: 2009