Accelerating RNN-Based Speech Enhancement on a Multi-core MCU with Mixed FP16-INT8 Post-training Quantization
Abstract
This paper presents an optimized methodology to design and deploy Speech Enhancement (SE) algorithms based on Recurrent Neural Networks (RNNs) on a state-of-the-art MicroController Unit (MCU) with 1+8 general-purpose RISC-V cores. To achieve low-latency execution, we propose an optimized software pipeline interleaving parallel computation of LSTM or GRU recurrent blocks, featuring vectorized 8-bit integer (INT8) and 16-bit floating-point (FP16) compute units, with manually-managed memory transfers of model parameters. To ensure minimal accuracy degradation with respect to the full-precision models, we propose a novel FP16-INT8 Mixed-Precision Post-Training Quantization (PTQ) scheme that compresses the recurrent layers to 8-bit while the bit precision of the remaining layers is kept to FP16. Experiments are conducted on multiple LSTM- and GRU-based SE models trained on the Valentini dataset, featuring up to 1.24M parameters. Thanks to the proposed approaches, we speed up the computation by up to 4× with respect to the lossless FP16 baselines. Differently from a uniform 8-bit quantization, which degrades the PESQ score by 0.3 on average, the Mixed-Precision PTQ scheme leads to a low degradation of only 0.06 while achieving a 1.4–1.7× memory saving. Thanks to this compression, we cut the power cost of the external memory by fitting the large models in the limited on-chip non-volatile memory, and we gain an MCU power saving of up to 2.5× by reducing the supply voltage from 0.8 V to 0.65 V while still matching the real-time constraints. Our design is >10× more energy efficient than state-of-the-art SE solutions deployed on single-core MCUs that make use of smaller models and quantization-aware training.
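The mixed-precision idea in the abstract (recurrent layers stored as INT8, all other layers kept in FP16) can be illustrated with a small sketch. This is not the authors' implementation: the symmetric per-tensor scaling rule, the toy model layout, and every name below (`to_fp16`, `quantize_int8`, `mixed_precision_ptq`, `size_bytes`, `RECURRENT_KINDS`, `toy_model`) are assumptions made for illustration only, and activations are not quantized here.

```python
import struct

# Hypothetical layer kinds treated as "recurrent" in this sketch.
RECURRENT_KINDS = {"lstm", "gru"}


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed rule): w ~ scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def mixed_precision_ptq(model):
    """Compress a model given as {layer_name: (kind, flat_weight_list)}.

    Recurrent layers are stored as INT8 plus one scale; other layers as FP16.
    """
    compressed = {}
    for name, (kind, weights) in model.items():
        if kind in RECURRENT_KINDS:
            q, scale = quantize_int8(weights)
            compressed[name] = {"dtype": "int8", "values": q, "scale": scale}
        else:
            fp16_values = [to_fp16(w) for w in weights]
            compressed[name] = {"dtype": "fp16", "values": fp16_values}
    return compressed


def size_bytes(compressed):
    """Weight storage in bytes (per-tensor scale factors are ignored)."""
    bytes_per_element = {"int8": 1, "fp16": 2}
    return sum(
        bytes_per_element[layer["dtype"]] * len(layer["values"])
        for layer in compressed.values()
    )


# Toy model with made-up weights: one linear layer and one LSTM layer.
toy_model = {
    "fc_in": ("linear", [0.5, -0.25, 0.125, 1.0]),
    "lstm": ("lstm", [0.9, -0.3, 0.05, -1.27]),
}
compressed = mixed_precision_ptq(toy_model)
fp16_baseline_bytes = 2 * sum(len(w) for _, w in toy_model.values())
memory_saving = fp16_baseline_bytes / size_bytes(compressed)
```

In this toy case half of the weights are recurrent, so the saving over an all-FP16 model is modest. The saving grows with the share of parameters that sit in the recurrent layers, which is consistent with the abstract reporting a range (1.4–1.7×) rather than a single figure.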
Related resources
An RNN based speech recognition system with discriminative training
Tan Lee, P.C. Ching and L.W. Chan. Department of Electronic Engineering and Department of Computer Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. Abstract: In our previous work [1], a novel method of utilizing a set of fully connected recurrent neural networks (RNNs) for speech modeling has been proposed. Despi...
A Speech Recognizer with Low Complexity Based on RNN
Speech recognition systems (SRS) designed for applications in low-cost products like telephones, or in systems with energy constraints like autonomous vehicles, are faced with the demand for solutions with low complexity. A small vocabulary consisting of a few command words and the digits is sufficient for most applications but has to be recognized robustly. Here we report about investiga...
Accelerate RNN-based Training with Importance Sampling
Importance sampling (IS), an elegant and efficient variance reduction (VR) technique for accelerating stochastic optimization problems, has attracted much research recently. Unlike the commonly adopted uniform sampling in stochastic optimization, IS-integrated algorithms sample training data at each iteration with respect to a weighted sampling probability distribution P, whi...
A Multi-Microphone Post-Filtering Approach for Speech Enhancement
Multi-microphone post-filtering allows additional noise reduction at a beamformer output. Existing techniques are either restricted to classical delay-and-sum beamformers, or are based on single-channel speech enhancement algorithms that are inefficient at attenuating transient noise. In this paper, we introduce a multi-microphone post-filtering approach, applicable to adaptive beamformers, that d...
Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech
The quality of text-to-speech (TTS) voices built from noisy speech is compromised. Enhancing the speech data before training has been shown to improve quality but voices built with clean speech are still preferred. In this paper we investigate two different approaches for speech enhancement to train TTS systems. In both approaches we train a recursive neural network (RNN) to map acoustic featur...
Journal
Journal title: Communications in Computer and Information Science
Year: 2023
ISSN: 1865-0937, 1865-0929
DOI: https://doi.org/10.1007/978-3-031-23618-1_41