Features interpolation domain for distributed speech recognition and performance for ITU-T G.723.1 codec
Abstract
In this paper, we examine the best domain in which to perform feature interpolation in Distributed Speech Recognition (DSR) systems. We show that the only domain in which the linear interpolation procedure yields a performance gain is the Line Spectral Frequencies (LSF) domain. A DSR scenario in which the ITU-T G.723.1 codec is employed is also investigated. The recognition features generated from the reconstructed speech are highly sensitive to the encoding noise. We also show that the LSF quantization scheme used by the G.723.1 codec decreases recognition performance by approximately 2%.
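As a rough illustration of the linear interpolation procedure mentioned in the abstract, the sketch below fills in missing feature frames between two received frames. It is a minimal example under stated assumptions, not the authors' implementation: the function names `interpolate_frames` and `is_ordered`, the idea that frames are dropped and reconstructed at the receiver, and the sample LSF values are all hypothetical.

```python
def interpolate_frames(prev_frame, next_frame, num_missing):
    """Linearly interpolate `num_missing` feature vectors between two known frames.

    Assumption: both frames are equal-length lists of floats in the same
    feature domain (for example, LSF vectors).
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same dimension")
    if num_missing < 0:
        raise ValueError("num_missing must be non-negative")

    frames = []
    for k in range(1, num_missing + 1):
        # Weight moves from the previous frame toward the next frame.
        w = k / (num_missing + 1)
        frames.append([(1 - w) * a + w * b for a, b in zip(prev_frame, next_frame)])
    return frames


def is_ordered(lsf):
    """Return True if the LSF vector is strictly increasing."""
    return all(x < y for x, y in zip(lsf, lsf[1:]))


# Hypothetical LSF vectors (arbitrary values chosen for illustration).
prev_lsf = [0.25, 0.5, 1.0]
next_lsf = [0.75, 1.0, 2.0]
filled = interpolate_frames(prev_lsf, next_lsf, 1)
print(filled)                              # [[0.5, 0.75, 1.5]]
print(all(is_ordered(f) for f in filled))  # True
```

A weighted average of two strictly increasing vectors is itself strictly increasing, so interpolated LSF vectors keep the ordering property of their endpoints. This may be one reason the LSF domain suits linear interpolation, although the abstract does not state the cause of the reported gain.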
Similar resources
Robust Speaker Recognition in the Presence of Speech Coding Distortion for Remote Access Applications
For wireless remote access security, forensics, border control and surveillance applications, there is an emerging need for biometric speaker recognition systems to be robust to speech coding distortion. This paper examines the robustness issue for three codecs, namely, the ITU-T 6.3 kilobits per second (kb/s) G.723.1, the ITU-T 8 kb/s G.729 and the 12.2 kb/s 3GPP GSM-AMR coder. Both speaker id...
A Fast LSF Search Algorithm Based on Interframe Correlation in G.723.1
We explain a time complexity reduction algorithm that improves the line spectral frequencies (LSF) search procedure on the unit circle for low bit rate speech codecs. The algorithm is based on strong interframe correlation exhibited by LSFs. The fixed point C code of ITU-T Recommendation G.723.1, which uses the “real root algorithm” was modified and the results were verified on ARM7TDMI general ...
Complexity Reduction of the Stochastic Code-Vector Search for ITU-T G.723.1 Codec
For multimedia communications, the computational complexity of a multimedia codec is required to match with different working platforms and integrated services of media sources. In this paper, two fast stochastic codebook search algorithms are proposed to reduce the computation required for the algebraic code excited linear predictive (ACELP) and multi-pulse maximum likelihood quantization (MP-...
Speaker and language recognition using speech codec parameters
In this paper, we investigate the effect of speech coding on speaker and language recognition tasks. Three coders were selected to cover a wide range of quality and bit rates: GSM at 12.2 kb/s, G.729 at 8 kb/s, and G.723.1 at 5.3 kb/s. Our objective is to measure recognition performance from either the synthesized speech or directly from the coder parameters themselves. We show that using speech...