Hierarchical Phoneme Classification for Improved Speech Recognition
Authors
Abstract
Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in subsequent processing steps. This paper proposes a hierarchical phoneme clustering method to exploit more suitable recognition models for different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. According to the results of a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, showing an overall improvement.
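The abstract does not say which clustering algorithm or similarity measure the authors use, so the following is only a minimal sketch of the general idea: group phonemes that a baseline model tends to confuse with one another. It assumes a symmetric, row-normalized confusion rate as the similarity and a greedy average-linkage merge with a stopping threshold. The phoneme labels, the counts in `CONF`, and the threshold are invented for illustration and are not taken from TIMIT or from the paper.

```python
from itertools import combinations

# Hypothetical toy data: rows are true phonemes, columns are predictions.
LABELS = ["s", "z", "m", "n", "p", "b"]
CONF = [
    [80, 15, 0, 0, 3, 2],
    [20, 75, 0, 0, 2, 3],
    [0, 0, 70, 28, 1, 1],
    [0, 1, 25, 72, 1, 1],
    [2, 1, 1, 0, 78, 18],
    [1, 2, 1, 1, 20, 75],
]


def confusion_similarity(conf, i, j):
    """Mean of the two row-normalized confusion rates between classes i and j."""
    row_i = sum(conf[i]) or 1
    row_j = sum(conf[j]) or 1
    return 0.5 * (conf[i][j] / row_i + conf[j][i] / row_j)


def cluster_phonemes(labels, conf, threshold):
    """Greedy average-linkage clustering on confusion similarity.

    Repeatedly merges the two groups with the highest mean pairwise
    confusion, and stops once no pair of groups reaches `threshold`.
    """
    groups = [[k] for k in range(len(labels))]

    def linkage(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(confusion_similarity(conf, i, j) for i, j in pairs) / len(pairs)

    while len(groups) > 1:
        (x, y), best = max(
            (
                ((a, b), linkage(groups[a], groups[b]))
                for a, b in combinations(range(len(groups)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # x < y always holds, so deleting index y leaves index x valid.
        groups[x] = groups[x] + groups[y]
        del groups[y]
    return [sorted(labels[k] for k in g) for g in groups]


groups = cluster_phonemes(LABELS, CONF, threshold=0.1)
print(groups)
```

In a hierarchical classifier of the kind the abstract describes, each resulting group could then be given its own specialized model, with a first-stage model deciding which group an input belongs to. That routing step is left out of this sketch.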
Similar resources
Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition
In this paper, we carry out two experiments on the TIMIT speech corpus with bidirectional and unidirectional Long Short Term Memory (LSTM) networks. In the first experiment (framewise phoneme classification) we find that bidirectional LSTM outperforms both unidirectional LSTM and conventional Recurrent Neural Networks (RNNs). In the second (phoneme recognition) we find that a hybrid BLSTM-HMM s...
Diagnostics of speech recognition using classification phoneme diagnostic trees
More than three decades of speech recognition research resulted in a very sophisticated statistical framework. However, less attention was still devoted to diagnostics of speech recognition; most previous research report on results in terms of ever-lower WER in various intrinsic or environmental conditions. This paper presents a diagnostics of the decoding process of ASR systems. The purpose of...
An Online Algorithm for Hierarchical Phoneme Classification
Abstract. We present an algorithmic framework for phoneme classification where the set of phonemes is organized in a predefined hierarchical structure. This structure is encoded via a rooted tree which induces a metric over the set of phonemes. Our approach combines techniques from large margin kernel methods and Bayesian analysis. Extending the notion of large margin to hierarchical classifica...
The Gamma MLP for Speech Phoneme Recognition
We define a Gamma multi-layer perceptron (MLP) as an MLP with the usual synaptic weights replaced by gamma filters (as proposed by de Vries and Principe (de Vries and Principe, 1992)) and associated gain terms throughout all layers. We derive gradient descent update equations and apply the model to the recognition of speech phonemes. We find that both the inclusion of gamma filters in all layer...
Clustering beyond phoneme contexts for speech recognition
The clustering of using decision trees is generalized to take into account high-level knowledge sources to better model the co-articulation effects in large vocabulary continuous speech recognition. VQ models are used to reduce the computational cost in constructing decision trees. The search algorithm is designed such that it can provide a general type of information for decision trees without ...
Journal
Journal title: Applied Sciences
Year: 2021
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app11010428