Improving the performance of HMM-based very low bit rate speech coding

Authors

  • Takahiro Hoshiya
  • Shinji Sako
  • Heiga Zen
  • Keiichi Tokuda
  • Takashi Masuko
  • Takao Kobayashi
  • Tadashi Kitamura
Abstract

In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMMs (Hidden Markov Models). In the coding system, the encoder carries out phoneme recognition and transmits phoneme indices, state durations, and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM. Finally, synthetic speech is obtained by using the MLSA (Mel Log Spectrum Approximation) filter according to the mel-cepstral coefficients and F0 information. In addition to the F0 quantization, we investigate encoding methods for the other parameters that reduce the bit rate while preserving the subjective speech quality. A subjective listening test shows that the proposed coder at about 100 to 150 bit/s outperforms a VQ-based vocoder at 600 bit/s (mel-cepstrum: 6 bit/frame × 50 frame/s, F0: 6 bit/frame × 50 frame/s).
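The 600 bit/s figure for the VQ-based baseline follows directly from the per-frame allocation quoted in the abstract. The short sketch below only reproduces that arithmetic and compares it with the 100 to 150 bit/s range reported for the proposed coder; the function and variable names are illustrative assumptions, not taken from the paper.

```python
def total_bit_rate(bits_per_frame, frames_per_second):
    """Sum the per-frame bits of every parameter stream and scale by the frame rate."""
    return sum(bits_per_frame.values()) * frames_per_second


# Baseline allocation quoted in the abstract: 6 bit/frame each for
# the mel-cepstrum and F0 streams, at 50 frame/s.
baseline_bits = {"mel_cepstrum": 6, "f0": 6}
baseline_rate = total_bit_rate(baseline_bits, frames_per_second=50)

# Range reported for the proposed HMM-based coder (approximate, per the abstract).
proposed_low, proposed_high = 100, 150

# How many times lower the proposed rate is than the baseline.
reduction_min = baseline_rate / proposed_high
reduction_max = baseline_rate / proposed_low

print(baseline_rate)                 # 600
print(reduction_min, reduction_max)  # 4.0 6.0
```

In other words, the proposed coder operates at roughly one quarter to one sixth of the baseline bit rate.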


Similar resources

A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques

This paper presents a very low bit rate speech coder based on HMM (Hidden Markov Model). The encoder carries out phoneme recognition, and transmits phoneme indexes, state durations and pitch information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indexes, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM by using...

Dynamic Unit Selection for Very Low Bit Rate Coding at 500 bits/sec

This paper presents a new unit selection process for Very Low Bit Rate speech encoding at around 500 bits/sec. The encoding is based on speech recognition and speech synthesis technologies. The aim of this approach is to make the best use of the speaker's speech corpus. The proposed solution uses HMM modelling for the recognition of elementary speech units. The HMMs are first trained in an unsupervised...

Progress Report of a Project in Very Low Bit-rate Speech Coding

Background work in various levels of speech coding is reviewed, including unconstrained coding and recognition-synthesis approaches that assume the signal is speech. A pilot project in HMM-TTS based speech coding is then described, in which a comparison with harmonic plus noise modelling is also done. Results of the demonstration project including samples of speech under various transmission si...

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding

In this paper, we propose a solution to reconstruct stress and accent contextual factors at the receiver of a very low bit rate speech codec built on a recognition/synthesis architecture. In speech synthesis, accent and stress symbols are predicted from the text, which is not available at the receiver side of the speech codec. Therefore, speech signal-based symbols, generated as syllable-level log...

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...




Publication date: 2003