Multilingual Information Retrieval with Asian Languages
Author
Abstract
There has been increasing interest in the Chinese, Japanese and Korean languages on the Web, and the first objective of this paper is to compare the retrieval performance of nine vector-space and two probabilistic models when carrying out monolingual searches in these three Asian languages. Based on the latest NTCIR-3 test collection, our second goal is to analyze the relative merit of using various automated tools to translate English-language topics into Chinese, Japanese or Korean, and then submitting a search against texts written in these languages. Moreover, we show how to improve bilingual searches by using both a combined translation strategy and a data fusion approach. Finally, we address the underlying problems of multilingual searches when an English topic is used to search documents written in English, Chinese and Japanese.
Similar resources
The Evaluation of Systems for Cross-language Information Retrieval
We describe the creation of an infrastructure for the testing of cross-language text retrieval systems within the context of the Text REtrieval Conferences (TREC) organised by the US National Institute of Standards and Technology (NIST). The approach adopted and the issues that had to be taken into consideration when building a multilingual test suite and developing appropriate evaluation proce...
A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps
With the increasing amount of multilingual text on the Internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method that applies the growing hierarchical self-organizing map (GHSOM) model to discover knowledge from multilingual text docu...
A Review on the Cross and Multilingual Information Retrieval
In this paper we explore some of the most important areas of information retrieval, in particular Cross-lingual Information Retrieval (CLIR) and Multilingual Information Retrieval (MLIR). CLIR deals with asking questions in one language and retrieving documents in a different language. MLIR deals with asking questions in one or more languages and retrieving documents in one or more different lang...
iAgent: A System for Managing Networked Tamil and Multilingual Information Resources
The advent of the World Wide Web (WWW) has created a novel means of information dissemination whereby information resources all over the world can be made available to a user connected to the net anywhere and anytime. As more and more information resources become available on the WWW, providing easy access to these resources has become a significant service. In this paper we prese...
Domain Specific Information Retrieval in Multilingual Environment
In today’s world of globalization, local-language storage and retrieval is essential for developing nations like India. As our country is diversified by languages and only 10% of the population is aware of the English language, this diversity of languages is becoming a barrier to understanding and becoming acquainted with the digital world. It has been found that when services are provided in local languages, it has b...