Incremental acoustic subspace learning for voice activity detection using harmonicity-based features

Authors

  • Jiaxing Ye
  • Takumi Kobayashi
  • Masahiro Murakawa
  • Tetsuya Higuchi
Abstract

This paper presents a novel voice activity detection (VAD) approach based on incremental subspace learning using harmonicity-based features. Harmonic structure is well known as a noise-robust speech feature. We develop a novel harmonicity-based feature built on temporal-spectral co-occurrence patterns. At the statistical decision stage, many conventional statistical VAD methods rely on a Gaussian model; however, owing to the non-Gaussian nature of speech, the Gaussian model fits poorly and produces incorrect VAD results. We reformulate VAD as incremental subspace learning. The candid covariance-free incremental PCA (CCIPCA) method is employed to adaptively model the input sound by a subspace. A speech activity measure is then established from the distance between the input sound and the adaptive subspace. Notably, the CCIPCA subspace update interval is set to 0.5 seconds in this work, and the deviation distance is computed afterwards. On such a short time scale, environmental sound presents a more Gaussian-like, stationary pattern and can therefore be well accommodated by the adaptive subspace. Conversely, speech always exhibits non-stationary characteristics, which lead to a distinct deviation from the adaptive acoustic subspace, so it can be effectively distinguished. We experimentally compared our scheme with various VAD methods on real-world data. The results validate the effectiveness of the proposed approach.
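The subspace idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the basic CCIPCA update rule without an amnesic parameter, takes generic feature vectors in place of the paper's harmonicity-based features, and ignores the 0.5-second update schedule. The class name `CCIPCA` and its methods `update`, `basis`, and `deviation` are hypothetical names chosen for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


class CCIPCA:
    """Basic candid covariance-free incremental PCA (illustrative sketch)."""

    def __init__(self, n_components):
        self.k = n_components
        self.n = 0          # number of samples seen so far
        self.mean = None    # running mean of the input vectors
        self.v = []         # unnormalised eigenvector estimates

    def update(self, x):
        """Fold one feature vector into the mean and the subspace estimate."""
        self.n += 1
        n = self.n
        if self.mean is None:
            self.mean = [0.0] * len(x)
        self.mean = [m + (xi - m) / n for m, xi in zip(self.mean, x)]
        u = [xi - m for xi, m in zip(x, self.mean)]  # centred sample
        for i in range(self.k):
            if i == len(self.v):
                # Start a new component from the current residual, if any.
                if dot(u, u) > 1e-24:
                    self.v.append(list(u))
                break
            v = self.v[i]
            norm = math.sqrt(dot(v, v))
            proj = dot(u, v) / norm
            # CCIPCA rule: v <- (n-1)/n * v + 1/n * u * (u . v / |v|)
            v = [(n - 1) / n * vj + proj * uj / n for vj, uj in zip(v, u)]
            self.v[i] = v
            # Deflate: remove the part of u explained by this component.
            c = dot(u, v) / dot(v, v)
            u = [uj - c * vj for uj, vj in zip(u, v)]

    def basis(self):
        """Orthonormalise the current estimates with Gram-Schmidt."""
        out = []
        for v in self.v:
            r = list(v)
            for b in out:
                c = dot(r, b)
                r = [rj - c * bj for rj, bj in zip(r, b)]
            norm = math.sqrt(dot(r, r))
            if norm > 1e-12:
                out.append([rj / norm for rj in r])
        return out

    def deviation(self, x):
        """Distance from x to the learned affine subspace."""
        r = [xi - m for xi, m in zip(x, self.mean)]
        for b in self.basis():
            c = dot(r, b)
            r = [rj - c * bj for rj, bj in zip(r, b)]
        return math.sqrt(dot(r, r))
```

In a VAD setting along these lines, the model would be updated on incoming frames, and frames whose `deviation` exceeds a threshold would be flagged as speech; the threshold and the feature extraction are outside the scope of this sketch.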

Similar resources

New harmonicity measures for pitch estimation and voice activity detection

Harmonic structure can be easily recognized in the time-frequency representation of speech signals, even in diverse environments. The harmonicity is a measure of the completeness of harmonic structure. This paper extends the use of the conventional harmonicity measure to the tasks of pitch estimation and voice activity detection. A set of hierarchical harmonicities, including grid, temporal, spect...


Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection

Voice activity detection (VAD) is an important frontend of many speech processing systems. In this paper, we describe a new VAD algorithm based on boosted deep neural networks (bDNNs). The proposed algorithm first generates multiple base predictions for a single frame from only one DNN and then aggregates the base predictions for a better prediction of the frame. Moreover, we employ a new acous...


An unsupervised visual-only voice activity detection approach using temporal orofacial features

Detecting the presence or absence of speech is an important step toward building robust speech-based interfaces. While previous studies have made progress on voice activity detection (VAD), the performance of these systems significantly degrades when subjects employ challenging speech modes that deviate from normal acoustic patterns (e.g., whisper speech), or in noisy/adverse conditions. An app...


A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...


Noise Robust Voice Activity Detection

Voice activity detection (VAD) is a fundamental task in various speech-related applications, such as speech coding, speaker diarization and speech recognition. It is often defined as the problem of distinguishing speech from silence/noise. A typical VAD system consists of two core parts: a feature extraction stage and a speech/non-speech decision mechanism. The first part extracts a set of parameter...



Publication date: 2013