Hierarchical Phoneme Classification for Improved Speech Recognition

Abstract

Speech recognition consists of converting input sound into a sequence of phonemes, then finding the corresponding text using language models. Phoneme classification performance is therefore a critical factor in the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art methods, and classification errors are hard to recover from in subsequent processing steps. This paper proposes a hierarchical phoneme clustering method that exploits more suitable models for different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline model. Using the automatic clustering results, a set of models optimized for the generated phoneme groups is constructed and integrated into a hierarchical classification method. According to the results of a number of experiments, the proposed group models improved over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed models, an overall improvement of 2.2 percentage points.
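
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea: phonemes that a baseline model often confuses with each other are merged into the same group. It assumes plain average-linkage agglomerative clustering on a symmetrized, row-normalized confusion matrix. The function names, the six phoneme labels, and the confusion counts are hypothetical illustrations, not TIMIT results or the authors' implementation.

```python
# Illustrative sketch: group phonemes by how often a baseline confuses them.
# All names and numbers here are assumptions made for the example.

LABELS = ["s", "z", "m", "n", "p", "b"]

# Toy confusion counts: row = true phoneme, column = predicted phoneme.
TOY_CONFUSION = [
    [80, 15, 1, 1, 2, 1],   # s
    [18, 75, 1, 2, 2, 2],   # z
    [1, 1, 70, 25, 2, 1],   # m
    [1, 2, 22, 72, 1, 2],   # n
    [2, 1, 1, 1, 78, 17],   # p
    [1, 2, 2, 1, 20, 74],   # b
]


def confusion_similarity(conf):
    """Row-normalize the counts, then average both confusion directions."""
    n = len(conf)
    rates = [[count / sum(row) for count in row] for row in conf]
    return [[0.5 * (rates[i][j] + rates[j][i]) for j in range(n)]
            for i in range(n)]


def cluster_phonemes(labels, conf, num_groups):
    """Merge the two most mutually confused clusters until num_groups remain."""
    sim = confusion_similarity(conf)
    clusters = [[i] for i in range(len(labels))]
    while len(clusters) > num_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean similarity over all cross-cluster pairs.
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                score = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(labels[i] for i in cluster) for cluster in clusters]


if __name__ == "__main__":
    print(cluster_phonemes(LABELS, TOY_CONFUSION, 3))
```

In a hierarchical classifier of the kind the abstract describes, each resulting group would then get its own specialized model, with a first-stage model choosing the group. That second stage is not shown here.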

Similar resources

Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition

In this paper, we carry out two experiments on the TIMIT speech corpus with bidirectional and unidirectional Long Short Term Memory (LSTM) networks. In the first experiment (framewise phoneme classification) we find that bidirectional LSTM outperforms both unidirectional LSTM and conventional Recurrent Neural Networks (RNNs). In the second (phoneme recognition) we find that a hybrid BLSTM-HMM s...

Diagnostics of speech recognition using classification phoneme diagnostic trees

More than three decades of speech recognition research have resulted in a very sophisticated statistical framework. However, less attention has been devoted to diagnostics of speech recognition; most previous research reports results in terms of ever-lower WER in various intrinsic or environmental conditions. This paper presents a diagnostics of the decoding process of ASR systems. The purpose of...

An Online Algorithm for Hierarchical Phoneme Classification

We present an algorithmic framework for phoneme classification where the set of phonemes is organized in a predefined hierarchical structure. This structure is encoded via a rooted tree which induces a metric over the set of phonemes. Our approach combines techniques from large margin kernel methods and Bayesian analysis. Extending the notion of large margin to hierarchical classifica...

The Gamma MLP for Speech Phoneme Recognition

We define a Gamma multi-layer perceptron (MLP) as an MLP with the usual synaptic weights replaced by gamma filters (as proposed by de Vries and Principe (de Vries and Principe, 1992)) and associated gain terms throughout all layers. We derive gradient descent update equations and apply the model to the recognition of speech phonemes. We find that both the inclusion of gamma filters in all layer...

Clustering beyond phoneme contexts for speech recognition

The clustering of phoneme contexts using decision trees is generalized to take into account high-level knowledge sources to better model the co-articulation effects in large vocabulary continuous speech recognition. VQ models are used to reduce the computational cost in constructing decision trees. The search algorithm is designed such that it can provide a general type of information for decision trees without ...

Journal

Journal title: Applied Sciences

Year: 2021

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app11010428