Sentence Position revisited: A robust light-weight Update Summarization baseline Algorithm
Abstract
In this paper, we describe a sentence-position-based summarizer built on a sentence position policy derived from the evaluation testbed of recent summarization tasks at the Document Understanding Conferences (DUC). We show that the resulting summarizer outperforms most systems participating in the task-focused summarization evaluations at the Text Analysis Conference (TAC) 2008. Our experiments also show that such a method performs better at producing short summaries (up to 100 words) than longer ones. Further, we discuss the baselines traditionally used for summarization evaluation and suggest reviving an old baseline to suit the current summarization task at TAC: the Update Summarization task.
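As a rough illustration of what a position-based baseline can look like, the sketch below selects sentences by their position in each document until a word budget is reached. It is not the paper's method: the abstract does not spell out the position policy learned from DUC data, so the round-robin ordering here (all first sentences, then all second sentences, and so on) is an assumption, as are the function name `position_summary` and its `max_words` parameter.

```python
from typing import List


def position_summary(documents: List[List[str]], max_words: int = 100) -> List[str]:
    """Build an extractive summary from sentence position alone.

    `documents` is a list of documents, each already split into sentences.
    Hypothetical policy (an assumption, not the paper's learned policy):
    take every document's first sentence, then every second sentence, etc.,
    and stop as soon as the next sentence would exceed the word budget.
    """
    summary: List[str] = []
    used_words = 0
    longest = max((len(doc) for doc in documents), default=0)

    for position in range(longest):
        for doc in documents:
            if position >= len(doc):
                continue  # this document has no sentence at this position
            sentence = doc[position]
            length = len(sentence.split())
            if used_words + length > max_words:
                return summary  # budget reached
            summary.append(sentence)
            used_words += length

    return summary


if __name__ == "__main__":
    docs = [
        ["Floods hit the region on Monday.", "Thousands were evacuated."],
        ["Rescue teams arrived on Tuesday.", "Aid is being distributed."],
    ]
    for line in position_summary(docs, max_words=12):
        print(line)
```

The default budget of 100 words mirrors the short-summary setting mentioned in the abstract; a real system would replace the round-robin ordering with a policy estimated from evaluation data.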
Similar resources
Update Summarization
Update Summarization is a form of multi-document summarization wherein we generate a summary of a multi-document dataset based on the assumption that the user has already read a given set of documents. In our paper, we present a summarization system which clusters together sentences from the old set based on a semantic similarity score. We then use the centroids of these clusters, along with an...
Experimenting with Clause Segmentation for Text Summarization
In this paper, we describe our experiments with clause segmentation in producing summaries for the TAC 2008 Update Summarization Track. The submitted runs were designed to determine if a heuristic clause segmentation applied before sentence selection would improve summarization results by reducing the need for sentence compression approaches. A baseline summariser was used to test this hypothes...
A Persian Text Summarization System Based on Linguistic Features and Regression
Considering the vast amount of existing written information and the shortage of time, optimal summarization of books, articles, news reports, etc. on the Web is a major concern of researchers. In this paper, we propose a new approach for Persian single-document Summarization based on several linguistic features of text. In our approach after extracting the linguistic features for each sentence,...
The ICSI/UTD Summarization System at TAC 2009
We describe improvements to our 2008 system that result in a top-performing summarization system. The motivating ideas are (1) improve sentence boundary detection to avoid damaging errors in preprocessing; (2) prune sentences that are unlikely to work well in a summary; (3) leverage sentence position to improve update summarization; (4) focus on high-precision sentence compression to improve re...
Evaluation of a Sentence Ranker for Text Summarization Based on Roget's Thesaurus
Evaluation is one of the hardest tasks in automatic text summarization. It is perhaps even harder to determine how much a particular component of a summarization system contributes to the success of the whole system. We examine how to evaluate the sentence ranking component using a corpus which has been partially labelled with Summary Content Units. To demonstrate this technique, we apply it to...