On compressibility of neural network phonological features for low bit rate speech coding

Authors

  • Afsaneh Asaei
  • Milos Cernak
  • Hervé Bourlard
Abstract

Phonological features extracted by neural networks have shown interesting potential for low bit rate speech vocoding. The span of phonological features is wider than the span of phonetic features, and thus fewer frames need to be transmitted. Moreover, the binary nature of phonological features enables a higher compression ratio at minor quality cost. In this paper, we study the compressibility and structured sparsity of the phonological features. We propose a compressive sampling framework for speech coding and sparse reconstruction for decoding prior to synthesis. Compressive sampling is found to be a principled way of compressing the features, in contrast to the conventional pruning approach; it leads to a 50% reduction in the bit rate for better or equal quality of the decoded speech. Furthermore, exploiting the structured sparsity and binary characteristic of these features is shown to enable very low bit rate coding at 700 bps with negligible quality loss; this coding scheme imposes no latency. If we consider a latency of 256 ms for supra-segmental structures, a rate of 250–350 bps is achieved.
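The encode/decode idea in the abstract (compressive sampling at the coder, sparse reconstruction at the decoder) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy sizes `n`, `m`, `k`, the Gaussian measurement matrix `A`, and the use of orthogonal matching pursuit (the `omp` helper) as the sparse solver. It also ignores quantisation of the measurements and the structured sparsity the paper exploits.

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the chosen columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coeffs
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 64, 48, 3  # toy sizes (hypothetical, not from the paper)

# Sparse binary vector standing in for one frame of phonological features.
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 1.0

# Random measurement matrix with unit-norm columns (an assumed choice).
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)

y = A @ x                                    # encoder: m < n linear measurements
x_hat = (omp(A, y, k) > 0.5).astype(float)   # decoder: sparse recovery, then binarise

recovered = bool(np.array_equal(x, x_hat))
```

Thresholding the recovered coefficients at 0.5 is one simple way to use the binary nature of the features at the decoder; the paper's actual decoder may differ.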

Similar resources

Optimum Drill Bit Selection by Using Bit Images and Mathematical Investigation

This study is designed to consider the two important yet often neglected factors, which are factory recommendation and bit features, in optimum bit selection. Image processing techniques have been used to consider the bit features. A mathematical equation, which is derived from a neural network model, is used for drill bit selection to obtain the bit’s maximum penetration rate that corresponds ...


Speaker dependent mapping for low bit rate coding of throat microphone speech

Throat microphones (TM) which are robust to background noise can be used in environments with high levels of background noise. Speech collected using TM is perceptually less natural. The objective of this paper is to map the spectral features (represented in the form of cepstral features) of TM and close speaking microphone (CSM) speech to improve the former’s perceptual quality, and to represe...


Speaker Recognition with Mismatched Coded Speech

This paper investigates the effects of low bit rate coded speech on the performance of a fixed-text speaker recognition system, under mismatched coding conditions between enrollment and testing. Significant degradation of performance has been observed relative to matched conditions, where the same coding is used. Two techniques have been proposed to overcome mismatch effects; a linear discriminative...


Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...


Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...


Publication date: 2015