Search results for: speech feature extraction

Number of results: 480,138

1997
Carlos Avendano, Sangita Tibrewala

To overcome the problems associated with the long impulse responses produced by reverberation, we use a long-time-window (high-frequency-resolution) analysis during the channel normalization steps of the feature extraction process in automatic speech recognition (ASR). After normalization, a trade-off between frequency and time resolution is used to increase the rate at which the time information is ...

2007
Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone

The article describes a video-only speech recognition system for a “silent speech interface” application, using ultrasound and optical images of the vocal tract. A one-hour audiovisual speech corpus was phonetically labeled using an automatic speech alignment procedure and robust visual feature extraction techniques. HMM-based stochastic models were estimated separately on the visual and acoust...

2016
Masood Delfarah, DeLiang Wang

Monaural speech separation in reverberant conditions is very challenging. In masking-based separation, features extracted from speech mixtures are employed to predict a time-frequency mask. Robust feature extraction is crucial for the performance of supervised speech separation in adverse acoustic environments. Using objective speech intelligibility as the metric, we investigate a wide variety ...
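Masking-based separation of the kind described in this entry trains a model to predict a time-frequency mask from features of the mixture. As a hedged illustration (not the authors' implementation), a common training target in this line of work is the ideal ratio mask (IRM), sketched here in NumPy from toy magnitude spectrograms:

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5):
    """Ideal ratio mask: per time-frequency unit, the ratio of speech
    energy to total (speech + noise) energy, raised to the power beta.
    speech_mag, noise_mag: magnitude spectrograms (freq x time)."""
    s2 = speech_mag ** 2
    n2 = noise_mag ** 2
    return (s2 / (s2 + n2 + 1e-12)) ** beta  # epsilon avoids divide-by-zero

# Toy example: 3 frequency bins x 2 time frames
speech = np.array([[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]])
noise  = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 1.0]])
mask = ideal_ratio_mask(speech, noise)
# mask values lie in [0, 1]; applying mask * mixture_mag attenuates
# noise-dominated units and keeps speech-dominated ones.
```

In a supervised system, a network is trained to predict this mask from extracted features; the abstract's point is that the choice of features largely determines robustness in reverberant and noisy conditions.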

2016
Alicia Lozano-Diez, Anna Silnova, Pavel Matějka, Ondřej Glembek, Oldřich Plchot, Jan Pešán, Lukáš Burget, Joaquin Gonzalez-Rodriguez

Recently, Deep Neural Network (DNN) based bottleneck features proved to be very effective in i-vector based speaker recognition. However, bottleneck feature extraction is usually optimized for the speech recognition task rather than for speaker recognition. In this paper, we explore whether DNNs that are suboptimal for speech recognition can provide better bottleneck features for speaker recognition. We experimen...

2008
Yasunari Obuchi, Masahito Togami, Takashi Sumiyoshi

We introduce a new class of speech processing, called Intentional Voice Command Detection (IVCD). To achieve a completely hands-free speech interface, it is necessary to reject not only noise but also unintended voices. The conventional voice activity detection (VAD) framework is not sufficient for this purpose, and we discuss how IVCD should be defined and how it can be realized. We investigate implementation of IVCD from the v...

2002
Y. S. Naous, G. F. Choueiter, M. I. Ohannessian, M. A. Al-Alaoui

In this paper, we describe a typical approach to implementing a voice-based control solution. Isolated-word speech recognition is performed using cepstral feature extraction and hidden Markov modeling of speech. The merit of this document lies in the amalgamation of the simplest yet most successful relevant methods into a coherent design guideline, aiming to trivialize the integration of speec...
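The cepstral feature extraction mentioned in this entry can be sketched in a few lines. The following is a minimal illustration of generic log-spectral cepstra (framing, windowing, log magnitude spectrum, DCT-II), not the paper's exact front end; in particular it omits the mel filterbank used in standard MFCCs:

```python
import numpy as np

def dct2(x):
    """Orthonormal DCT-II along the last axis, in pure NumPy."""
    N = x.shape[-1]
    n = np.arange(N)
    k = n.reshape(-1, 1)
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * N))  # basis[k, n]
    scale = np.full(N, np.sqrt(2.0 / N))
    scale[0] = np.sqrt(1.0 / N)
    return (x @ basis.T) * scale

def cepstral_features(signal, frame_len=256, hop=128, n_ceps=13):
    """Frame -> Hamming window -> log magnitude spectrum -> DCT -> cepstra."""
    frames = [signal[s:s + frame_len]
              for s in range(0, len(signal) - frame_len + 1, hop)]
    frames = np.array(frames) * np.hamming(frame_len)
    spec = np.abs(np.fft.rfft(frames, axis=1))
    log_spec = np.log(spec + 1e-10)        # epsilon guards log(0)
    return dct2(log_spec)[:, :n_ceps]      # keep low-order cepstra

# 1 s of a synthetic 440 Hz tone sampled at 8 kHz
t = np.arange(8000) / 8000.0
feats = cepstral_features(np.sin(2 * np.pi * 440 * t))
```

Each row of `feats` is one frame's cepstral vector; in an HMM-based recognizer like the one described above, such vectors (typically with deltas appended) form the observation sequence.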

1985
Chia-ying Lee, James R. Glass, Oded Ghitza, Hung-an Chang, Ekapol Chuangsuwanich, Yuan Shen, Stephen Shum, Yaodong Zhang

A closed-loop, auditory-based speech feature extraction algorithm is presented to address the problem of unseen noise in robust speech recognition. This closed-loop model is inspired by the possible role of the medial olivocochlear (MOC) efferent system of the human auditory periphery, which has been suggested in [6, 13, 42] to be important for human speech intelligibility in noisy environment....

1998
Juergen Luettin

We address the problem of robust lip tracking, visual speech feature extraction, and sensor integration for audiovisual speech recognition applications. An appearance-based model of the articulators, which represents linguistically important features, is learned from example images and is used to locate, track, and recover visual speech information. We tackle the problem of joint temporal mo...

2010
Nishi Sharma, Parminder Singh

This paper presents an automatic speech segmentation (ASS) technique to segment spontaneous speech into syllable-like units. In the development of a syllable-centric ASS system, segmentation of the acoustic signal into syllabic units is an important stage. In this paper we focus on identifying the minimum unit of speech to be considered when training any speech recognition system. There are sy...

Journal: CoRR, 2016
Toni Heidenreich, Michael W. Spratling

Visual speech recognition aims to identify the sequence of phonemes from continuous speech. Unlike the traditional approach of using 2D image feature extraction methods to derive features of each video frame separately, this paper proposes a new approach using a 3D (spatio-temporal) Discrete Cosine Transform to extract features of each feasible sub-sequence of an input video which are subsequen...
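The 3D (spatio-temporal) DCT used as the feature transform in this entry is separable, so it can be computed by applying a 1D DCT-II along each of the three axes in turn. The following NumPy sketch illustrates the transform itself only; the paper's sub-sequence selection and subsequent processing of the coefficients are not reproduced here:

```python
import numpy as np

def dct_axis(x, axis):
    """Orthonormal DCT-II along one axis via a matrix multiply."""
    N = x.shape[axis]
    n = np.arange(N)
    k = n.reshape(-1, 1)
    D = np.cos(np.pi * (2 * n + 1) * k / (2 * N))  # D[k, n]
    D *= np.sqrt(2.0 / N)
    D[0] *= np.sqrt(0.5)  # row 0 scale becomes sqrt(1/N)
    return np.apply_along_axis(lambda v: D @ v, axis, x)

def dct3(volume):
    """Separable 3D (spatio-temporal) DCT of a video sub-sequence,
    shaped (frames, height, width)."""
    out = volume.astype(float)
    for ax in range(3):
        out = dct_axis(out, ax)
    return out

# Toy video clip: 4 frames of 8x8 pixels
clip = np.random.default_rng(0).random((4, 8, 8))
coeffs = dct3(clip)
```

Because the transform is orthonormal, energy is preserved and concentrated in the low-order coefficients, which is what makes truncating to a small coefficient block a reasonable feature-compression step.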
