Multi-Document Summarization Using Cross-Language Texts
Abstract
Lacking a summarization system for the source language, we try to generate a source-language summary by combining machine-translated documents with a summarization system for the target language. To summarize the multiple machine-translated documents, we extract important sentences and remove redundant ones using an improved term-weighting method, which assigns weights to words using syntactic information. We select sentences according to their scores and map them back to the Japanese sentences in the original documents. Finally, we arrange the Japanese sentences in chronological order and report them as the output of our system. We submitted both a short and a long summary, and the evaluation results were not good. However, our approach shows the possibility of multi-document summarization using cross-language texts.
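The pipeline described in the abstract (score translated sentences, drop redundant ones, map the survivors back to the original sentences, then order them chronologically) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the syntactic term-weighting method, so plain term frequency is used as a stand-in, and the Jaccard-overlap redundancy test, the threshold, and all names (Sentence, summarize, redundancy_threshold) are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

PUNCT = ".,;:!?\"'()"


@dataclass
class Sentence:
    original: str    # sentence in the source language (e.g. Japanese)
    translated: str  # machine-translated sentence in the target language
    published: date  # date of the document the sentence comes from


def tokenize(text):
    # Naive whitespace tokenizer: lowercase and strip basic punctuation.
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]


def term_weights(sentences):
    # ASSUMPTION: plain term frequency stands in for the paper's
    # syntax-based term weighting, which the abstract does not specify.
    counts = Counter()
    for s in sentences:
        counts.update(tokenize(s.translated))
    return counts


def score(sentence, weights):
    # Average term weight of the translated sentence.
    tokens = tokenize(sentence.translated)
    if not tokens:
        return 0.0
    return sum(weights[t] for t in tokens) / len(tokens)


def overlap(a, b):
    # ASSUMPTION: Jaccard overlap of word sets as the redundancy measure.
    ta, tb = set(tokenize(a.translated)), set(tokenize(b.translated))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def summarize(sentences, max_sentences=2, redundancy_threshold=0.5):
    weights = term_weights(sentences)
    # Rank by score; sorted() is stable, so ties keep input order.
    ranked = sorted(sentences, key=lambda s: score(s, weights), reverse=True)
    chosen = []
    for cand in ranked:
        if len(chosen) >= max_sentences:
            break
        # Skip candidates too similar to an already chosen sentence.
        if all(overlap(cand, c) < redundancy_threshold for c in chosen):
            chosen.append(cand)
    # Report the original-language sentences in chronological order.
    chosen.sort(key=lambda s: s.published)
    return [s.original for s in chosen]
```

Because each Sentence carries both the original and the translated text, mapping a selected sentence back to its source is a field lookup here; a real system would need a sentence alignment between the translation and the original documents.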
Similar resources
A Survey on Multi-Document Summarization
Multi-document summarization aims at delivering the majority of information content from multiple documents using much less lengthy texts, usually a short paragraph of several hundred words. This paper surveys several different approaches to multi-document summarization by first building a unified high level view of the multi-document summarization problem, and then comparing different approach...
A Generative Approach for Multi-Document Summarization using Semantic-Discursive information
Multi-document summarization is the automatic production of a unique summary from a collection of texts. In this paper, we propose a statistical generative approach for multi-document summarization that combines simple information such as sentence position in the text and semantic-discursive information from CST (Cross-Document Structure Theory). In particular, we formulate the multi-document s...
Developing Infrastructure for the Evaluation of Single and Multi-document Summarization Systems in a Cross-lingual Environment
We describe our work on the development of Language and Evaluation Resources for the evaluation of summaries in English and Chinese. The language resources include a parallel corpus of English and Chinese texts which are translations of each other, a set of queries in both languages, clusters of documents relevant to each query, sentence relevance measures for each sentence in the document clu...
A Generative Approach for Multi-Document Summarization using the Noisy Channel Model
Multi-document summarization is the automatic production of a unique summary from a collection of texts. This task has become very important, since it assists the information processing in days where the amount of information is growing considerably. In this paper, we propose a statistical generative approach for multi-document summarization. In particular, we formulate the multi-document summa...
Automatic Summarization from Multiple Documents (Extended Abstract)
Since the late 50’s and Luhn [Luh58] the information community has expressed its interest in summarizing texts. The domains of application of such methodologies are countless, ranging from news summarization [WL03, BM05, ROWBG05] to scientific article summarization [TM02] and meeting summarization [NPDP05, ELH03]. Summarization has been defined as a reductive transformation of a given set of te...