Towards improving statistical model based voice activity detection
Abstract
Statistical model based voice activity detection (VAD) is widely used in speech research and applications. In this paper, we aim to improve its performance with a new feature extraction method. Our main contribution is to use Mel-frequency subband coefficients with a power-law nonlinearity as the feature for statistical model based VAD, in place of Discrete Fourier Transform (DFT) coefficients. The proposed feature is then modeled by a Gaussian distribution. We compare the performance of this method comprehensively with existing methods, and we also test the power-law nonlinearity on those existing methods. Experimental results show that the proposed subband coefficients considerably improve the performance of statistical model based VAD, and that applying the power-law nonlinearity to DFT coefficients also brings some improvement.
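The feature pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the triangular filter shape, the number of bands, the FFT size, the Hann window, the exponent value of 0.1, and the diagonal-covariance Gaussian are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # Common Mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_bands, n_fft, sample_rate):
    """Triangular Mel filters, shape (n_bands, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    # n_bands + 2 edge frequencies, equally spaced on the Mel scale.
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_bands + 2))
    fbank = np.zeros((n_bands, n_bins))
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fbank[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank


def subband_features(frame, fbank, exponent=0.1):
    """Mel subband energies of one frame, compressed by a power law.

    The exponent is an assumed value; the abstract does not state one.
    """
    n_fft = 2 * (fbank.shape[1] - 1)
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2
    return (fbank @ power) ** exponent


def gaussian_log_likelihood(x, mean, var):
    """Log-density of x under a diagonal Gaussian (assumed model form)."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)
```

Under these assumptions, a frame could be scored by evaluating `gaussian_log_likelihood` once with speech-model parameters and once with noise-model parameters, then comparing the difference against a threshold. How the model parameters are estimated and how the decision rule is formed in the paper is not specified in the abstract.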
Similar resources
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Efficient Implementation of Statistical Model-Based Voice Activity Detection Using Taylor Series Approximation
In this letter, we propose a simple but effective technique that improves statistical model-based voice activity detection (VAD) by both reducing computational complexity and increasing detection accuracy. The improvements are made by applying Taylor series approximations to the exponential and logarithmic functions in the VAD algorithm based on an in-depth analysis of the algorithm. Experiment...
A statistical model-based voice activity detection employing minimum classification error technique
In this paper, we apply a discriminative weight training to a statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios (LRs) based on a minimum classification error (MCE) method. That approach is different from that of previous works in that different weights are assigned to each fre...
A Support Vector Machine-Based Voice Activity Detection Employing Effective Feature Vectors
In this letter, we propose effective feature vectors to improve the performance of voice activity detection (VAD) employing a support vector machine (SVM), which is known to incorporate an optimized nonlinear decision over two different classes. To extract the effective feature vectors, we present a novel scheme that combines the a posteriori SNR, a priori SNR, and predicted SNR, widely adopted...
Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision
© 2012 ETRI Journal, Volume 34, Number 2, April 2012. In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement for the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decision in the previous ...