Using Information Extraction to Improve Cross-lingual Document Retrieval
Abstract
We present a filtering mechanism using two cross-lingual information extraction (CLIE) systems for improving document relevance of cross-lingual information retrieval (CLIR) for queries conforming to predefined templates. Experiments on retrieving Chinese documents in response to English GALE arrest queries show that this approach can obtain a 12.7% absolute improvement in relevance (representing a 24.8% relative error reduction) for the top 25 retrieved documents. We also demonstrate that Chinese IE can provide a valuable supplement to English IE to enhance retrieval performance.
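The two figures quoted above (12.7% absolute improvement, 24.8% relative error reduction) can be related with a short calculation. The sketch below assumes that "error" means the share of non-relevant documents among the top 25; the baseline values it derives are implied by the two reported figures and are not stated in the abstract. The function and variable names are illustrative only.

```python
def implied_baseline_error(absolute_gain: float, relative_error_reduction: float) -> float:
    """Return the baseline error rate implied by an absolute gain in relevance
    and the relative error reduction that gain represents.

    relative_error_reduction = absolute_gain / baseline_error
    """
    return absolute_gain / relative_error_reduction


# Figures reported in the abstract.
absolute_gain = 0.127             # 12.7% absolute improvement in relevance
relative_error_reduction = 0.248  # 24.8% relative error reduction

# Derived values (assumptions implied by the figures, not reported results).
baseline_error = implied_baseline_error(absolute_gain, relative_error_reduction)
baseline_relevance = 1.0 - baseline_error
improved_relevance = baseline_relevance + absolute_gain

print(f"implied baseline error:     {baseline_error:.3f}")
print(f"implied baseline relevance: {baseline_relevance:.3f}")
print(f"implied improved relevance: {improved_relevance:.3f}")
```

Under that assumption, the figures imply a baseline in which roughly 51% of the top 25 documents were not relevant, so relevance would rise from about 49% to about 61%.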
Similar resources
Experiments in Cross Language Query Focused Multi-Document Summarization
The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual information robustly and efficiently, with as high quality performance as possible. Previous research activities on multilingual information access systems have studied cross-language information retrieval (CLIR), information ...
Cross Lingual Query Dependent Snippet Generation
The present paper describes the development of a cross lingual query dependent snippet generation module. It is a language independent module, so it also performs as a multilingual snippet generation module. It is a module of the Cross Lingual Information Access (CLIA) system. This module takes the query and content of each retrieved document and generates a query dependent snippet for each ret...
JAVELIN III: Cross-Lingual Question Answering from Japanese and Chinese Documents
In this paper, we describe the JAVELIN Cross Language Question Answering system, which includes modules for question analysis, keyword translation, document retrieval, information extraction and answer generation. In the NTCIR6 CLQA2 evaluation, our system achieved 19% and 13% accuracy in the English-to-Chinese and English-to-Japanese subtasks, respectively. An overall analysis and a detailed m...
The Future of Multilingual Summarization: Beyond Sentence Extraction
In this paper I present a vision for the future of multilingual summarization that focuses on summarizing differences between documents: generating sentences that explain the main points of controversy in the document set, identifying different sides in the dialogue and the claims they support, and identifying how content differs across document boundaries (cultural, national, political, etc.)....
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach for document image retrieval based on keyword spotting is proposed. In the proposed method, a framework using relevance feedback is presented. Relevance feedback, an interactive and efficient method, is used in this paper to imp...