Context-sensitive probabilistic phone mapping model for cross-lingual speech recognition
Abstract
This paper presents a probabilistic phone mapping (PPM) model that enables automatic speech recognition using a foreign phonetic system. We formulate the training of the phone mapping model in the framework of maximum likelihood estimation. The model can be learned automatically, using the Expectation Maximisation algorithm, from the reference phonetic transcript and the phonetic transcript produced by a foreign phonetic recogniser. This paper also compares the use of temporal and spatial contexts to enhance the phone mapping performance. A decision tree clustering technique is used to tie unseen contexts for robustness. We evaluate the PPM method on cross-lingual phone and isolated word recognition tasks, using Hungarian and Russian phone recognisers to recognise Czech speech. Consistent improvement is obtained by using context-dependent phone mapping.
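As a rough illustration of how mapping probabilities can be estimated by maximum likelihood with Expectation Maximisation, the sketch below trains a context-independent table P(target phone | foreign phone) from paired phone sequences. It is not the paper's method: it swaps in a simple alignment-free mixture model (in the style of IBM Model 1 from machine translation), assuming that each target phone is generated by one of the foreign phones in the same utterance. The function name `train_phone_mapping` and the phone labels are hypothetical.

```python
from collections import defaultdict


def train_phone_mapping(pairs, iterations=10):
    """Estimate P(target | foreign) by EM over paired phone sequences.

    pairs: list of (foreign_seq, target_seq), each a list of phone labels.
    Assumption (not from the paper): an IBM Model 1 style mixture with no
    context and no explicit time alignment.
    """
    foreign_vocab = {f for fs, _ in pairs for f in fs}
    target_vocab = {t for _, ts in pairs for t in ts}

    # Uniform initialisation of the mapping table.
    prob = {
        f: {t: 1.0 / len(target_vocab) for t in target_vocab}
        for f in foreign_vocab
    }

    for _ in range(iterations):
        counts = defaultdict(lambda: defaultdict(float))
        for foreign_seq, target_seq in pairs:
            for t in target_seq:
                # E-step: posterior that each foreign phone generated t.
                norm = sum(prob[f][t] for f in foreign_seq)
                if norm == 0:
                    continue  # empty foreign sequence or zero probability
                for f in foreign_seq:
                    counts[f][t] += prob[f][t] / norm
        # M-step: renormalise expected counts into probabilities.
        for f in foreign_vocab:
            total = sum(counts[f].values())
            if total > 0:
                prob[f] = {t: counts[f][t] / total for t in target_vocab}
    return prob


# Toy usage with made-up phone labels.
toy_pairs = [(["a", "b"], ["x", "y"]), (["a"], ["x"])]
mapping = train_phone_mapping(toy_pairs)
```

On the toy data, the second pair pins "a" to "x", so EM shifts the remaining probability mass for "b" towards "y". A context-dependent variant would key the table on a foreign phone together with its neighbours, which is where tying of unseen contexts becomes necessary.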
Similar resources
Stream-based context-sensitive phone mapping for cross-lingual speech recognition
Recently, a Probabilistic Phone Mapping (PPM) model was proposed to facilitate cross-lingual automatic speech recognition using a foreign phonetic system. Under this framework, discrete hidden Markov models (HMMs) are used to map a foreign phone sequence to a target phone sequence. Context-sensitive mapping is made possible by expanding the discrete observation symbols to include the contexts o...
Context-dependent phone mapping for LVCSR of under-resourced languages
This paper presents a context-dependent phone mapping approach to acoustic modeling for large vocabulary speech recognition in under-resourced languages, leveraging well-trained models of other languages. Generally speaking, phone mapping can be considered as a hybrid HMM/MLP (Hidden Markov Model / Multilayer Perceptron) model where the input of the MLP is phone acoustic scores, e.g. like...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Mandarin/English mixed-lingual name recognition for mobile phone
Speaker-independent name speech recognition has become a popular application in handheld devices such as mobile phones and personal digital assistants (PDAs). This paper presents a new mixed-lingual ASR system that will enable Chinese mobile phone users to conduct Mandarin and English name speech recognition simultaneously without switching language modes. We created an elaborately designed mixed acous...
State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
A phone mapping-based method has been introduced for cross-lingual speaker adaptation in HMM-based speech synthesis. In this paper, we go on to propose a state mapping based method for cross-lingual speaker adaptation, where the state mapping between voice models in the source and target languages is established under a minimum Kullback-Leibler divergence (KLD) criterion. We introduce two approach...