Automatic Language Identification Using Phoneme and Automatically Derived Unit Strings
Abstract
This paper presents language identification (LID) based on phonotactic modeling. Approaches using phoneme strings and strings of units automatically derived by an Ergodic HMM (EHMM) are compared. The phoneme recognizers were trained on 6 languages from the OGI multi-language corpus and on Czech SpeechDat-E. The LID results are obtained on 4 languages. The results show the superiority of the Czech phoneme recognizer when used for LID, and promising trends for the EHMM-derived units.
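As a rough illustration of what phonotactic scoring over unit strings can look like, the sketch below trains one smoothed bigram model per language and picks the language whose model gives a test string the highest log-likelihood. This is not the system described in the paper: the abstract does not state the model order or smoothing, so the bigram order, the add-alpha smoothing, the function names (`train_bigram`, `log_likelihood`, `identify`) and the toy unit strings are all assumptions made for illustration.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary markers


def train_bigram(strings):
    """Count bigrams over unit strings (each string is a list of unit labels)."""
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for units in strings:
        seq = [BOS] + list(units) + [EOS]
        vocab.update(units)
        for prev, cur in zip(seq, seq[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def log_likelihood(model, units, alpha=1.0):
    """Log-probability of a unit string under an add-alpha smoothed bigram model."""
    bigrams, contexts, vocab = model
    size = len(vocab) + 1  # one extra slot reserved for unseen units
    seq = [BOS] + list(units) + [EOS]
    return sum(
        math.log((bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * size))
        for prev, cur in zip(seq, seq[1:])
    )


def identify(models, units):
    """Return the language whose model scores the unit string highest."""
    return max(models, key=lambda lang: log_likelihood(models[lang], units))


# Toy data (invented): language "A" alternates units, language "B" repeats them.
models = {
    "A": train_bigram([["a", "b", "a", "b"], ["b", "a", "b", "a"]]),
    "B": train_bigram([["a", "a", "b", "b"], ["b", "b", "a", "a"]]),
}
print(identify(models, ["a", "b", "a", "b"]))
print(identify(models, ["a", "a", "b", "b"]))
```

The same scoring applies whether the unit labels come from a phoneme recognizer or from automatically derived units, since the model only sees the label sequence.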
Related resources
Extracting Chinese Frequent Strings Without a Dictionary From a Chinese Corpus and its Applications
This paper describes how to extract Chinese frequent strings without using a dictionary. In this paper, we generalize the notations of words and unknown words to those of frequent strings. The Chinese frequent strings (CFSs) we define include words, unknown words, and other strings that are frequently used. Some examples of CFSs are “ (can only let)”, “ (every minute and every second)”, “ (bear...
Towards automatic speech recognition without pronunciation dictionary, transcribed speech and text resources in the target language using cross-lingual word-to-phoneme alignment
In this paper we tackle the task of bootstrapping an Automatic Speech Recognition system without an a priori given language model, a pronunciation dictionary, or transcribed speech data for the target language Slovene – only untranscribed speech and translations to other resource-rich source languages of what was said are available. Therefore, our approach is highly relevant for under-resourced...
New variant of the Self Organizing Map in Pulsed Neural Networks to Improve Phoneme Recognition in Continuous Speech
Speech recognition has gradually improved over the years, phoneme recognition in particular. Phoneme recognition plays a very important role in speech processing. Phoneme strings are a basic representation for automatic language recognition, and it has been shown that language recognition results are highly correlated with phoneme recognition results. Nowadays, many recognizers are based on Artificial ne...
Theoretical error prediction for a language identification system using optimal phoneme clustering
Kay M. Berkling, Etienne Barnard, (berkling,barnard)@cse.ogi.edu, Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology. A neural network based language identification system is described, which uses language-independent phoneme clusters as speech units to recognize the language spoken by native speakers over the tele...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...