Summarization Evaluation Methods: Experiments and Analysis
Abstract
Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary, and an evaluation of how well summaries help a person perform a task such as information retrieval. We carried out two large experiments to study the two evaluation methods. Our results show that different parameters of an experiment can dramatically affect how well a system scores. For example, summary length was found to affect both types of evaluation. For the "ideal" summary based evaluation, accuracy decreases as summary length increases, while for task based evaluations summary length and accuracy on an information retrieval task appear to correlate randomly. In this paper, we show how this parameter and others can affect evaluation results and describe how parameters can be controlled to produce a sound evaluation.
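To make the "ideal" summary comparison concrete, the sketch below scores a system extract against a human-selected ideal extract by sentence overlap. The sentence indices and the precision/recall scoring are illustrative assumptions, not the protocol used in the experiments; it also hints at why length matters, since adding sentences can only raise recall while precision tends to fall.

```python
# Minimal sketch of "ideal" summary based scoring by sentence overlap.
# The sentence indices are hypothetical; a real evaluation would also need
# to align near-identical sentences rather than match exact indices.

def extract_overlap(system_sents, ideal_sents):
    """Precision, recall, and F1 of a system extract against an ideal extract."""
    system, ideal = set(system_sents), set(ideal_sents)
    overlap = len(system & ideal)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(ideal) if ideal else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

if __name__ == "__main__":
    # Hypothetical extracts: the system picked sentences 2, 5, 7; the ideal is 1, 2, 7, 9.
    print(extract_overlap([2, 5, 7], [1, 2, 7, 9]))  # precision 2/3, recall 1/2, F1 4/7
```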
Related papers
Analysis of Summarization Evaluation Experiments
The goals of my dissertation are: 1) to propose a French terminology for the presentation of evaluation results of automatic summaries, 2) to identify and describe experimental variables in evaluations of automatic summaries, 3) to highlight the most common tendencies, inconsistencies and methodological problems in summarization evaluation experiments, and 4) to make recommendations for the pre...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
A Comparison of Summarization Methods Based on Task-based Evaluation
A task-based evaluation scheme has been adopted as a new method of evaluation for automatic text summarization systems. It evaluates the performance of a summarization system in a given task, such as information retrieval and text categorization. This paper compares ten different summarization methods based on information retrieval tasks. In order to evaluate the system performance, the subject...
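In its simplest form, a task-based evaluation of this kind reduces to scoring relevance judgments made from summaries against gold relevance labels. The sketch below is an illustrative assumption of that scoring step, not the protocol of the paper; the query and document identifiers are hypothetical.

```python
# Minimal sketch of task-based scoring: relevance judgments made from summaries
# are compared with gold relevance labels. All identifiers are hypothetical.

def judgment_accuracy(judgments, gold):
    """Fraction of (query, document) pairs judged the same as the gold labels."""
    assert judgments.keys() == gold.keys()
    correct = sum(judgments[pair] == gold[pair] for pair in gold)
    return correct / len(gold)

if __name__ == "__main__":
    gold = {("q1", "d1"): True, ("q1", "d2"): False, ("q2", "d1"): False}
    from_summaries = {("q1", "d1"): True, ("q1", "d2"): True, ("q2", "d1"): False}
    print(judgment_accuracy(from_summaries, gold))  # 2 of 3 judgments match -> 0.666...
```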
Experiments on Semantic-based Clustering for Cross-document Coreference
We describe clustering experiments for cross-document coreference for the first Web People Search Evaluation. In our experiments we apply agglomerative clustering to group together documents potentially referring to the same individual. The algorithm is informed by the results of two different summarization strategies and an off-the-shelf named entity recognition component. We present different ...
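As an illustration of the clustering step only, the sketch below applies average-link agglomerative clustering to toy document vectors with SciPy. The features and the distance cutoff are assumptions; the paper's system derives its evidence from summaries and named entities rather than raw counts.

```python
# Minimal sketch of agglomerative clustering over document vectors.
# The feature vectors and the 0.5 distance cutoff are hypothetical.

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Toy document vectors (rows = documents, columns = assumed features).
docs = np.array([
    [1.0, 0.0, 2.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [0.0, 2.0, 0.0, 1.0],
])

distances = pdist(docs, metric="cosine")               # pairwise cosine distances
tree = linkage(distances, method="average")            # average-link agglomeration
labels = fcluster(tree, t=0.5, criterion="distance")   # cut the tree at distance 0.5
print(labels)  # documents sharing a label are grouped as the same individual
```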
Sentence Position revisited: A robust light-weight Update Summarization baseline Algorithm
In this paper, we describe a sentence-position-based summarizer built on a sentence position policy created from the evaluation testbed of recent summarization tasks at Document Understanding Conferences (DUC). We show that the summarizer thus built is able to outperform most systems participating in task focused summarization evaluations at Text Analysis Conferences (TAC) 2008. ...
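The sketch below illustrates the general idea of a position-based summarizer with a hypothetical position-weight table; it is not the policy learned from the DUC testbed.

```python
# Minimal sketch of a sentence-position summarizer: each document position has a
# prior weight, and the highest-weighted sentences are kept in original order.
# The weights and example sentences are hypothetical.

def position_based_summary(sentences, position_weights, budget=2):
    """Select `budget` sentences whose positions have the highest prior weight."""
    scored = [(position_weights.get(i, 0.0), i) for i in range(len(sentences))]
    top = sorted(scored, reverse=True)[:budget]   # best-scoring positions
    keep = sorted(i for _, i in top)              # restore document order
    return [sentences[i] for i in keep]

if __name__ == "__main__":
    # Hypothetical policy: early positions carry most of the weight.
    weights = {0: 1.0, 1: 0.6, 2: 0.1, 3: 0.4}
    sents = ["S0 lead.", "S1 context.", "S2 detail.", "S3 wrap-up."]
    print(position_based_summary(sents, weights))  # ['S0 lead.', 'S1 context.']
```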