Multilingual Information Retrieval with Asian Languages

Author

  • Jacques Savoy
Abstract

Interest in the Chinese, Japanese and Korean languages on the Web has been increasing. The first objective of this paper is to compare the retrieval performance of nine vector-space and two probabilistic models when carrying out a monolingual search in these three Asian languages. Our second goal, based on the latest NTCIR-3 test collection, is to analyze the relative merit of using various automated tools to translate English-language topics into Chinese, Japanese or Korean, and then submitting a search based on texts written in these languages. Moreover, we show how to improve bilingual searches by using both a combined translation strategy and a data fusion approach. Finally, we address the underlying problems of multilingual searches when an English topic is used to search documents written in the English, Chinese and Japanese languages.

Similar resources

The Evaluation of Systems for Cross-language Information Retrieval

We describe the creation of an infrastructure for the testing of cross-language text retrieval systems within the context of the Text REtrieval Conferences (TREC) organised by the US National Institute of Standards and Technology (NIST). The approach adopted and the issues that had to be taken into consideration when building a multilingual test suite and developing appropriate evaluation proce...

A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps

With the increasing amount of multilingual texts in the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method, which applied the growing hierarchical self-organizing map (GHSOM) model, to discover knowledge from multilingual text docu...

A Review on the Cross and Multilingual Information Retrieval

In this paper we explore some of the most important areas of information retrieval, in particular Crosslingual Information Retrieval (CLIR) and Multilingual Information Retrieval (MLIR). CLIR deals with asking questions in one language and retrieving documents in a different language. MLIR deals with asking questions in one or more languages and retrieving documents in one or more different lang...

iAgent: A System for Managing Networked Tamil and Multilingual Information Resources

The advent of the World Wide Web (WWW) has created a novel means for information dissemination whereby information resources all over the world can be made available to a user connected to the net anywhere and anytime. As more and more information resources are becoming available on the WWW, providing easy access to these information resources has become a significant service. In this paper we prese...

Domain Specific Information Retrieval in Multilingual Environment

In today’s world of globalization, local language storage and retrieval is essential for developing nations like India. As our country is diversified by languages and only 10% of the population knows the English language, this diversity of languages is becoming a barrier to understanding and becoming acquainted with the digital world. It has been found that when services are provided in local languages, it has b...

Publication date: 2004