Features interpolation domain for distributed speech recognition and performance for ITU-t g.723.1 CODEC

نویسندگان

  • Vladimir Fabregas Surigué de Alencar
  • Abraham Alcaim
چکیده

In this paper, we examine the best domain to perform features interpolation in Distributed Speech Recognition (DSR) systems. We show that the only one domain where a performance gain can be achieved from the linear interpolation procedure is in the Line Spectral Frequencies (LSF) domain. A DSR scenario where the ITU-T G.723.1 codec is employed is also investigated. The recognition feature generated from the reconstructed speech is highly sensitive to the encoding noise. We have also shown that the LSF quantization scheme used by the G.723.1 codec decreases the recognition performance by approximately 2 %.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust Speaker Recognition in the Presence of Speech Coding Distortion for Remote Access Applications

For wireless remote access security, forensics, border control and surveillance applications, there is an emerging need for biometric speaker recognition systems to be robust to speech coding distortion. This paper examines the robustness issue for three codecs, namely, the ITU-T 6.3 kilobits per second (kb/s) G.723.1, the ITU-T 8 kb/s G.729 and the 12.2 kb/s 3GPP GSM-AMR coder. Both speaker id...

متن کامل

A Fast LSF Search Algorithm Based on Interframe Correlation in G.723.1

We explain a time complexity reduction algorithm that improves the line spectral frequencies (LSF) search procedure on the unit circle for low bit rate speech codecs. The algorithm is based on strong interframe correlation exhibited by LSFs. The fixed point C code of ITU-T Recommendation G.723.1, which uses the “real root algorithm” wasmodified and the results were verified on ARM7TDMI general ...

متن کامل

Complexity Reduction of the Stochastic Code -Vector Search for ITU-T G.723.1codec

For multimedia communications, the computational complexity of a multimedia codec is required to match with different working platforms and integrated services of media sources. In this paper, two fast stochastic codebook search algorithms are proposed to reduce the computation required for the algebraic code excited linear predictive (ACELP) and multi-pulse maximum likelihood quantization (MP-...

متن کامل

Clusions, and Recommendations Are Those of the Authors and Arenot Necessarily Endorsed by the United States

In this paper, we investigate the e ect of speech coding on speaker and language recognition tasks. Three coders were selected to cover a wide range of quality and bit rates: GSM at 12.2 kb/s, G.729 at 8 kb/s, and G.723.1 at 5.3 kb/s. Our objective is to measure recognition performance from either the synthesized speech or directly from the coder parameters themselves. We show that using speech...

متن کامل

Speaker and language recognition using speech codec parameters

In this paper, we investigate the e ect of speech coding on speaker and language recognition tasks. Three coders were selected to cover a wide range of quality and bit rates: GSM at 12.2 kb/s, G.729 at 8 kb/s, and G.723.1 at 5.3 kb/s. Our objective is to measure recognition performance from either the synthesized speech or directly from the coder parameters themselves. We show that using speech...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007