Cross-lingual Information Retrieval Using Hidden Markov Models
Abstract
This paper presents empirical results in cross-lingual information retrieval using English queries to access Chinese documents (TREC-5 and TREC-6) and Spanish documents (TREC-4). Since our interest is in languages where resources may be minimal, we use an integrated probabilistic model that requires only a bilingual dictionary as a resource. We explore how a combined probability model of term translation and retrieval can reduce the effect of translation ambiguity. In addition, we estimate an upper bound on performance, if translation ambiguity were a solved problem. We also measure performance as a function of bilingual dictionary size.
Similar Resources
Cheap Bootstrap of Multi-Lingual Hidden Markov Models
In this work we investigate the usage of TV audio data for cross-language training of multi-lingual acoustic models. We intend to take advantage from the availability of a training speech corpus formed by parallel news uttered in different languages and transmitted over separated audio channels. Spanish, French and Russian phone Hidden Markov Models (HMMs) are bootstrapped using an unsupervised...
Learning Semantics with Deep Belief Network for Cross-Language Information Retrieval
This paper introduces a cross-language information retrieval (CLIR) framework that combines the state-of-the-art keyword-based approach with a latent semantic-based retrieval model. To capture and analyze the hidden semantics in cross-lingual settings, we construct latent semantic models that map text in different languages into a shared semantic space. Our proposed framework consists of deep b...
Introducing Busy Customer Portfolio Using Hidden Markov Model
Due to the effective role of Markov models in customer relationship management (CRM), there is a lack of comprehensive literature review which contains all related literatures. In this paper the focus is on academic databases to find all the articles that had been published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...
State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
A phone mapping-based method had been introduced for cross-lingual speaker adaptation in HMM-based speech synthesis. In this paper, we continue to propose a state mapping based method for cross-lingual speaker adaptation, where the state mapping between voice models in source and target languages is established under minimum Kullback-Leibler divergence (KLD) criterion. We introduce two approach...
Cross-Lingual Alignment of FrameNet Annotations through Hidden Markov Models
The development of annotated resources in the area of frame semantics has been crucial to the development of robust systems for shallow semantic parsing. Resource-poor languages have shown a significant delay due to the lack of sufficient training data. Recent works proposed to exploit parallel corpora in order to automatically transfer the semantic information available for English to other ta...