Study on tone classification of Chinese continuous speech in speech recognition system
Abstract
In this paper, we first introduce the use of Gaussian mixture models (GMMs) for Chinese tone classification in continuous speech. We then explain how to integrate the tone classifier with an HMM-based speech recognition system. Finally, we report the tone classification accuracy of this probabilistic method, tested on the Chinese continuous speech database of the national “863” project.
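As a rough illustration of the GMM decision rule mentioned in the abstract, the sketch below scores a feature vector under one diagonal-covariance GMM per tone and picks the most likely tone. It is not the paper's implementation: the two-dimensional feature (normalized mean F0 and F0 slope), the names `TONE_MODELS` and `classify_tone`, and all parameter values are hypothetical, and in practice the mixtures would be trained (for example with EM) on labeled speech. The integration with the HMM recognizer is not shown.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify_tone(x, tone_models):
    """Return the tone label whose GMM gives x the highest log-likelihood."""
    return max(tone_models, key=lambda tone: gmm_log_likelihood(x, tone_models[tone]))

# Hypothetical, hand-set models over [normalized mean F0, normalized F0 slope].
# Real models would be estimated from training data, not written by hand.
TONE_MODELS = {
    1: [(1.0, [0.8, 0.0], [0.05, 0.05])],    # tone 1: high, level
    2: [(0.5, [0.4, 0.6], [0.05, 0.05]),     # tone 2: rising (two components)
        (0.5, [0.5, 0.4], [0.05, 0.05])],
    3: [(1.0, [0.2, -0.1], [0.05, 0.05])],   # tone 3: low
    4: [(1.0, [0.6, -0.7], [0.05, 0.05])],   # tone 4: falling
}

if __name__ == "__main__":
    # A syllable with mid-range F0 and a clearly rising contour.
    print(classify_tone([0.45, 0.5], TONE_MODELS))
```

Equal prior probabilities over the tones are assumed here; unequal priors could be added as a log-prior term per tone before taking the maximum.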
Similar resources
Chinese language is a tonal language
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...