Clustering-Based Language Independent Multiple-Document Summarizer at MSE 2006

Authors

  • Angelo Dalli
  • Roberta Catizone
  • Yorick Wilks
Abstract

We describe our participation in the Multilingual Summarization Evaluation (MSE) 2006, in which multiple documents in English, Arabic, and Arabic-English machine translations are used to create a brief 100-word summary in English. Our system's output was evaluated using the automated ROUGE evaluation system. We describe the greedy optimization technique used to ensure that summaries always obey the length constraint while maximizing their score. A language-independent clustering mechanism is used to identify the most important sentences quickly and efficiently.
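
As an illustration of the length-constrained greedy selection mentioned in the abstract, the following Python sketch picks sentences in descending score order and skips any sentence that would push the summary past the word budget. The function name `greedy_summary`, the (sentence, score) input format, and whitespace-based word counting are assumptions made for this example; the abstract does not specify the authors' actual scoring or selection procedure.

```python
from typing import List, Tuple


def greedy_summary(
    scored_sentences: List[Tuple[str, float]], max_words: int = 100
) -> Tuple[List[str], int]:
    """Greedily pick high-scoring sentences without exceeding max_words.

    scored_sentences: (sentence, score) pairs; the scores are assumed to
    come from some earlier ranking step (hypothetical here).
    Returns the chosen sentences and the total number of words used.
    """
    # Consider the highest-scoring sentences first.
    ranked = sorted(scored_sentences, key=lambda pair: pair[1], reverse=True)

    chosen: List[str] = []
    used = 0
    for sentence, _score in ranked:
        n_words = len(sentence.split())  # naive whitespace tokenization
        # Skip empty sentences and any sentence that would break the limit.
        if n_words == 0 or used + n_words > max_words:
            continue
        chosen.append(sentence)
        used += n_words
    return chosen, used


if __name__ == "__main__":
    candidates = [("a b c", 3.0), ("d e f g", 2.0), ("h i", 1.0)]
    print(greedy_summary(candidates, max_words=5))
```

Because a sentence that does not fit is skipped instead of ending the loop, shorter lower-scoring sentences can still fill the remaining budget, and the result can never exceed `max_words`.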


Similar resources

IS_SUM: A Multi-Document Summarizer based on Document Index Graphic and Lexical Chains

IS_SUM is a summarizer developed at Institute of Software (IS) of Chinese Academy of Sciences for DUC 2005. We adopt a new way for clustering and summarizing documents by integrating Document Index Graphic (DIG) [7] with Lexical Chains [5]. Our results show the benefit of integrating DIG with Lexical Chains.


Multi-Objective Optimization for Clustering of Medical Publications

Clustering the results of a search can help a multi-document summarizer present a summary for evidence based medicine (EBM). In this work, we introduce a clustering technique that is based on multiobjective (MOO) optimization. MOO is a technique that shows promise in the areas of machine learning and natural language processing. In our approach we show how MOO based semi-supervised clustering t...


A Multi-Document Multi-Lingual Automatic Summarization System

In this paper, a new multi-document multi-lingual text summarization technique, based on singular value decomposition and hierarchical clustering, is proposed. The proposed approach relies on only two resources for any language: a word segmentation system and a dictionary of words along with their document frequencies. The summarizer initially takes a collection of related documents, a...


CLASSY Arabic and English Multi-Document Summarization

Our Multilingual Summarization Evaluation entries for MSE-2006 were based upon an improved version of our CLASSY (Clustering, Linguistics, And Statistics for Summarization Yield) system. Our two entries were systems 20 and 21 and represented approaches based upon extracts from a) only English documents and b) English and the translated Arabic documents (full clusters). This paper presents a bri...


A Cognitive Interactive Framework for Multi-Document Summarizer

In this paper, we present a generic interactive framework based on human cognition, where the system can learn continuously from the Internet and from its interaction with the users. To show the utilization of this framework, Iintelli, an agent based application for multiple text document summarization is developed and compared with the MEAD on the Cran Data Set. Mead is a natural language proc...




Publication date: 2006