Adaptive weighting of microphone arrays for distant-talking F0 and voiced/unvoiced estimation
Abstract
This paper introduces a multi-microphone processing technique that provides features for fundamental frequency (F0) extraction and for voiced/unvoiced classification of distant-talking speech. A multichannel periodicity function (MPF) is derived from an adaptive weighting of normalized and compressed magnitude spectra. This function highlights periodicity cues in the given speech signals, even under noisy and reverberant conditions. The resulting MPF features are then used for voiced/unvoiced classification based on Hidden Markov Models. Experiments on both simulated data and real seminar recordings, captured with a network of reversed T-shaped arrays, showed the robustness of the proposed technique.
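The abstract does not state the MPF formula, so the following is only a minimal sketch of the general idea it describes: per-channel magnitude spectra are normalized, compressed, combined with adaptive weights, and transformed back to the lag domain to expose periodicity. The weighting rule (each channel is weighted by the height of its own periodicity peak), the compression exponent, and all names (`multichannel_periodicity`, `compress`, `f0_min`, `f0_max`) are assumptions made for illustration and should not be read as the authors' actual method.

```python
import numpy as np


def multichannel_periodicity(frames, fs, compress=0.5, f0_min=60.0, f0_max=400.0):
    """Illustrative multichannel periodicity function (assumed formulation).

    frames: array of shape (n_channels, frame_len), one time-aligned frame
            per microphone.
    Returns (mpf, f0_estimate_hz, weights).
    """
    frames = np.asarray(frames, dtype=float)
    n_ch, n = frames.shape
    n_fft = 2 * n  # zero-padding avoids circular wrap-around in the lag domain

    # Windowed magnitude spectrum of every channel.
    mag = np.abs(np.fft.rfft(frames * np.hanning(n), n=n_fft, axis=1))

    # Normalize each spectrum to unit sum, then compress its dynamic range.
    mag /= mag.sum(axis=1, keepdims=True) + 1e-12
    comp = mag ** compress

    # Per-channel periodicity: inverse transform of the compressed spectrum,
    # scaled so that lag 0 equals 1.
    per = np.fft.irfft(comp, n=n_fft, axis=1)[:, :n]
    per /= per[:, :1] + 1e-12

    # Lag range corresponding to the plausible F0 range.
    lag_min = max(1, int(fs / f0_max))
    lag_max = min(int(fs / f0_min), n - 1)

    # Hypothetical adaptive weights: channels with a stronger periodicity
    # peak (typically less noise or reverberation) contribute more.
    peaks = np.clip(per[:, lag_min:lag_max + 1].max(axis=1), 0.0, None)
    total = peaks.sum()
    weights = peaks / total if total > 0 else np.full(n_ch, 1.0 / n_ch)

    # Weighted combination across channels and F0 pick from the best lag.
    mpf = weights @ per
    best_lag = lag_min + int(np.argmax(mpf[lag_min:lag_max + 1]))
    return mpf, fs / best_lag, weights
```

In a full system along the lines of the abstract, values of such a function (for example its peak height per frame) would be passed as features to an HMM-based voiced/unvoiced classifier; that stage is not sketched here.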
Similar resources
Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis
In parametric text-to-speech synthesis using Hidden Markov Models (HMMs), modelling of the fundamental frequency (F0) parameter is important because it has a direct effect on the prosody of synthetic speech. F0 is typically modelled by a discrete distribution for unvoiced speech and a continuous distribution for voiced speech, using a multi-space distribution (MSD). However, F0 modelling using MSD-HMM i...
Predicting F0 and voicing from NAM-captured whispered speech
The NAM-to-speech conversion proposed by Toda and colleagues, which converts Non-Audible Murmur (NAM) to audible speech by statistical mapping trained on aligned corpora, is a very promising technique, but its performance is still insufficient, mainly due to the difficulty of estimating the F0 of the transformed voice from unvoiced speech. In this paper, we propose a method to improve F0 estimatio...
Improvement to a NAM-captured whisper-to-speech system (accepted manuscript)
Exploiting a tissue-conductive sensor – a stethoscopic microphone – the system developed at NAIST, which converts Non-Audible Murmur (NAM) to audible speech by GMM-based statistical mapping, is a very promising technique. The quality of the converted speech is, however, still insufficient for computer-mediated communication, notably because of the poor estimation of F0 from unvoiced speech and beca...
Improvement to a NAM-captured whisper-to-speech system
Exploiting a tissue-conductive sensor – a stethoscopic microphone – the system developed at NAIST, which converts Non-Audible Murmur (NAM) to audible speech by GMM-based statistical mapping, is a very promising technique. The quality of the converted speech is, however, still insufficient for computer-mediated communication, notably because of the poor estimation of F0 from unvoiced speech and beca...
Estimation of fundamental frequency from surface electromyographic data: EMG-to-F0
In this paper, we present our recent studies of F0 estimation from surface electromyographic (EMG) data using a Gaussian mixture model (GMM)-based voice conversion (VC) technique, referred to as EMG-to-F0. In our approach, a support vector machine classifies individual frames as unvoiced or voiced (U/V), and voiced F0 contours are discriminated by the trained GMM based on the manner of mini...