Distributed Speech Recognition Using TRAPS-estimated Manne
Abstract
In this paper, we investigate the use of TemPoRal PatternS (TRAPS) classifiers for estimating manner-of-articulation features on the small-vocabulary Aurora-2002 database. By combining a stream of TRAPS-estimated manner features with a stream of noise-robust MFCC features (proposed earlier in the Aurora-2002 evaluation by OGI, ICSI and Qualcomm), we obtain an average absolute improvement of 0.4% to 1.0% in word recognition accuracy over the noise-robust MFCC baseline features on Aurora tasks. This yields an average relative improvement of 54% over the reference end-pointed MFCC baseline. The manner features can be estimated on the server, so they do not increase the terminal-side computational complexity of a distributed speech recognition (DSR) system.
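The abstract quotes both absolute and relative improvements and describes combining two feature streams. The sketch below shows one common reading of those terms. It assumes that relative improvement means relative reduction in word error rate and that the streams are combined by frame-wise concatenation; neither detail is stated in the abstract. The function names and the numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_acc, system_acc):
    """Relative reduction in word error rate, in percent.

    Inputs are word accuracies in percent; error rate is taken
    as 100 - accuracy (an assumption, not stated in the abstract).
    """
    baseline_err = 100.0 - baseline_acc
    system_err = 100.0 - system_acc
    if baseline_err <= 0:
        raise ValueError("baseline has no errors to reduce")
    return 100.0 * (baseline_err - system_err) / baseline_err


def combine_streams(mfcc_frames, manner_frames):
    """Concatenate two per-frame feature streams frame by frame.

    Concatenation is an assumed combination scheme for illustration.
    """
    if len(mfcc_frames) != len(manner_frames):
        raise ValueError("streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(mfcc_frames, manner_frames)]


# Hypothetical accuracies, chosen only to make the arithmetic easy to follow.
baseline_acc = 80.0
system_acc = 90.0
print(system_acc - baseline_acc)                        # 10.0 (absolute, in points)
print(relative_improvement(baseline_acc, system_acc))   # 50.0 (relative, in percent)

# Two frames of a 2-dimensional stream joined with a 1-dimensional stream.
print(combine_streams([[1, 2], [3, 4]], [[0.1], [0.9]]))
```

Under this reading, a small absolute gain over an already strong baseline can still be a large relative gain, which is why the abstract can report both a sub-1% absolute figure and a 54% relative figure against different baselines.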
Similar resources
Band-independent speech-event categ
Band-independent categories are investigated for feature estimation in ASR. These categories represent distinct speech events manifested in frequency-localized temporal patterns of the speech signal. A single, universal estimator is proposed for estimating speech-event posterior probabilities using temporal patterns of critical-band energies for all the bands. The estimated posteriors are used a...
Lost Speech Reconstruction Method using Missing Feature Theory and HMM
In recent years, IP telephone service has spread rapidly. However, an unavoidable problem of IP telephone service is deterioration of speech due to packet loss, which often occurs on wireless networks. To overcome this problem, we propose a novel lost speech reconstruction method using speech recognition based on Missing Feature Theory and HMM-based speech synthesis. The proposed method uses li...
Improving out-of-coverage lang multimodal dialogue system usin
For automatic speech recognition, the construction of an adequate language model may be difficult when only a limited amount of training text is available. Previous work has shown that, for small training sets, statistical language models may outperform grammars on out-of-coverage utterances while showing comparable performance on in-coverage input. In this paper, we compare the perfor...
Robust Speech Recognition Using Intra-speaker Ada
Inter-speaker variation can be handled rather well in speech recognition by speaker adaptation techniques such as MLLR and MAP. However, for speech other than read speech, such as conversational or emotional speech, current recognition systems cannot achieve satisfactory performance even after speaker adaptation. In view of this situation, two-level adaptation met...
Temporal Patterns (TRAPS) in ASR of Noisy
Phoenix, Arizona, USA, March 1999. Temporal Patterns (TRAPS) in ASR of Noisy Speech. Hynek Hermansky (1, 2) and Sangita Sharma (1). (1) Oregon Graduate Institute of Science and Technology, Portland, Oregon, USA. (2) International Computer Science Institute, Berkeley, California, USA. Abstract: In this paper we study a new approach to processing temporal information for automat...