Cross-lingual Information Retrieval Using Hidden Markov Models

نویسندگان

  • Jinxi Xu
  • Ralph M. Weischedel
چکیده

This paper presents empirical results in cross-lingual information retrieval using English queries to access Chinese documents (TREC-5 and TREC-6) and Spanish documents (TREC-4). Since our interest is in languages where resources may be minimal, we use an integrated probabilistic model that requires only a bilingual dictionary as a resource. We explore how a combined probability model of term translation and retrieval can reduce the effect of translation ambiguity. In addition, we estimate an upper bound on performance, if translation ambiguity were a solved problem. We also measure performance as a function of bilingual dictionary size.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cheap Bootstrap of Multi-Lingual Hidden Markov Models

In this work we investigate the usage of TV audio data for cross-language training of multi-lingual acoustic models. We intend to take advantage from the availability of a training speech corpus formed by parallel news uttered in different languages and transmitted over separated audio channels. Spanish, French and Russian phone Hidden Markov Models (HMMs) are bootstrapped using an unsupervised...

متن کامل

Learning Semantics with Deep Belief Network for Cross-Language Information Retrieval

This paper introduces a cross-language information retrieval (CLIR) framework that combines the state-of-the-art keyword-based approach with a latent semantic-based retrieval model. To capture and analyze the hidden semantics in cross-lingual settings, we construct latent semantic models that map text in different languages into a shared semantic space. Our proposed framework consists of deep b...

متن کامل

Introducing Busy Customer Portfolio Using Hidden Markov Model

Due to the effective role of Markov models in customer relationship management (CRM), there is a lack of comprehensive literature review which contains all related literatures. In this paper the focus is on academic databases to find all the articles that had been published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...

متن کامل

State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis

A phone mapping-based method had been introduced for cross-lingual speaker adaptation in HMM-based speech synthesis. In this paper, we continue to propose a state mapping based method for cross-lingual speaker adaptation, where the state mapping between voice models in source and target languages is established under minimum Kullback-Leibler divergence (KLD) criterion. We introduce two approach...

متن کامل

Cross-Lingual Alignment of FrameNet Annotations through Hidden Markov Models

The development of annotated resources in the area of frame semantics has been crucial to the development of robust systems for shallow semantic parsing. Resource-poor languages have shown a significant delay due to the lack of sufficient training data. Recent works proposed to exploit parallel corpora in order to automatically transfer the semantic information available for English to other ta...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000