A Comparative Study of Query and Document Translation for Cross-Language Information Retrieval

Author

  • Douglas W. Oard
Abstract

Cross-language retrieval systems use queries in one natural language to guide retrieval of documents that might be written in another. Acquisition and representation of translation knowledge plays a central role in this process. This paper explores the utility of two sources of translation knowledge for cross-language retrieval. We have implemented six query translation techniques that use bilingual term lists and one based on direct use of the translation output from an existing machine translation system; these are compared with a document translation technique that uses output from the same machine translation system. Average precision measures on a TREC collection suggest that arbitrarily selecting a single dictionary translation is typically no less effective than using every translation in the dictionary, that query translation using a machine translation system can achieve somewhat better effectiveness than simpler techniques, and that document translation may result in further improvements in retrieval effectiveness under some conditions.
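The contrast between selecting a single dictionary translation and using every translation can be pictured with a minimal sketch. The term list, the function names, and the choice to pass untranslatable terms through unchanged are all illustrative assumptions, not the paper's six techniques or its data; "arbitrary" selection is modelled here simply as taking the first listed translation.

```python
from typing import Dict, List

# Hypothetical toy bilingual term list (English -> Spanish), for illustration only.
TERM_LIST: Dict[str, List[str]] = {
    "bank": ["banco", "orilla"],
    "interest": ["interes", "tipo"],
}


def translate_every(query: List[str], term_list: Dict[str, List[str]]) -> List[str]:
    """Replace each query term with every translation the term list offers."""
    translated: List[str] = []
    for term in query:
        # Assumption: terms missing from the list are kept untranslated.
        translated.extend(term_list.get(term, [term]))
    return translated


def translate_single(query: List[str], term_list: Dict[str, List[str]]) -> List[str]:
    """Replace each query term with one arbitrarily chosen translation (the first)."""
    return [term_list.get(term, [term])[0] for term in query]


if __name__ == "__main__":
    query = ["bank", "interest", "rate"]
    print(translate_every(query, TERM_LIST))
    print(translate_single(query, TERM_LIST))
```

The every-translation query is longer and mixes senses of each source term, while the single-selection query keeps one candidate per term; the abstract reports that the latter is typically no less effective on the collection studied.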


Similar resources

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...


A Comparative Study of Knowledge-Based Approaches for Cross-Language Information Retrieval

Cross-language retrieval systems seek to use queries in one natural language to guide the retrieval of documents that might be written in another. Acquisition and representation of translation knowledge plays a central role in this process. This paper explores the utility of two sources of manually encoded translation knowledge, bilingual dictionaries and translation lexicons, for cross-languag...


Integrating Different Strategies for Cross-Language Information Retrieval in the MIETTA Project

In this paper we describe an integrated approach to cross-language retrieval within the MIETTA project, whose objective is to build a special-purpose search engine in the tourism domain that covers information from a number of geographical regions. MIETTA is designed to enable users to search and retrieve information on the regions covered, preferably in their own language. In order to facilitat...


A Lexical Knowledge Base Approach for English-Chinese Cross Language Information Retrieval (Jiangping Chen; JASIS, forthcoming)

This study proposes and explores a natural language processing (NLP) based strategy to address out-of-dictionary and vocabulary mismatch problems in query-translation-based English-Chinese Cross Language Information Retrieval (EC-CLIR). The strategy, named the LKB approach, is to construct a lexical knowledge base (LKB) and to use it for query translation. This paper describes the LKB constructi...



Publication date: 1998