Search results for: speech feature extraction

Number of results: 480138

2004
Hossein Marvi Edward Chilton

Conventional cepstra are one-dimensional; however, speech characteristics are represented better by an acoustic image, a two-dimensional feature representation. In this paper, acoustic images based on the two-dimensional root cepstrum (TDRC) are used as features for speaker-independent speech recognition. The TDRC is a method of feature extraction which has some advantages over other methods. The ...
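The root cepstrum the abstract refers to replaces the logarithm of the conventional cepstrum with a power-law (root) compression of the magnitude spectrum. A minimal one-dimensional sketch follows; the choice of `gamma = 0.5` and the frame length are assumptions for illustration, and the paper's TDRC builds a two-dimensional acoustic image that this sketch does not reproduce:

```python
import numpy as np

rng = np.random.default_rng(0)

def root_cepstrum(frame, gamma=0.5):
    """One-dimensional root cepstrum of a speech frame.

    Instead of log|X(w)| as in the conventional real cepstrum,
    the magnitude spectrum is compressed by an assumed root gamma.
    """
    spectrum = np.abs(np.fft.fft(frame))
    return np.real(np.fft.ifft(spectrum ** gamma))

# Illustrative windowed frame (random stand-in for real speech samples).
frame = np.hamming(256) * rng.standard_normal(256)
c = root_cepstrum(frame)
```

Setting `gamma` closer to 0 approaches the behaviour of the log cepstrum, while larger values keep more spectral dynamic range.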

2010
Taemin Cho Ron J. Weiss Juan P. Bello

Most automatic chord recognition systems follow a standard approach combining chroma feature extraction, filtering and pattern matching. However, despite much research, there is little understanding about the interaction between these different components, and the optimal parameterization of their variables. In this paper we perform a systematic evaluation including the most common variations i...
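The chroma feature extraction step mentioned above folds spectral energy onto the 12 pitch classes of the octave. A minimal sketch of that folding, not the authors' pipeline; the reference frequency `fmin = 55.0` Hz (pitch class A) and the frame parameters are assumptions:

```python
import numpy as np

def chroma(frame, fs, fmin=55.0):
    """Fold FFT magnitude bins onto 12 pitch classes (a chroma vector)."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    ch = np.zeros(12)
    for f, m in zip(freqs, mag):
        if f < fmin:
            continue  # skip DC and sub-bass bins below the reference
        pc = int(round(12 * np.log2(f / fmin))) % 12  # semitones mod octave
        ch[pc] += m
    return ch / (ch.sum() + 1e-12)  # normalize to a distribution

fs = 8000
n = np.arange(2000)
tone = np.sin(2 * np.pi * 220.0 * n / fs)  # 220 Hz: two octaves above fmin
ch = chroma(tone, fs)
```

Real systems typically add a window, tuning estimation, and smoothing over frames before the pattern-matching stage the abstract describes.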

2003
Luciana Gonçalves da Silveira Jacques Facon Díbio Leandro Borges

Audio-visual Speech Recognition has been an active area of research lately. A big, and as yet unsolved, part of this problem is visual-only recognition, or lip reading. Considering an image sequence of a person pronouncing a word, a full image analysis solution would have to segment the mouth area, extract relevant features, and use them to be able to classify the word from those visual featur...

2012
Malay Kumar

Speech is the most natural way of communication between human beings. The field of speech recognition has long held the intrigue of man–machine conversation, and owing to its versatile applications, automatic speech recognition systems have been designed. In this paper we present a novel approach for Hindi speech recognition by ensembling the feature extraction modules of ASR systems and their outputs...

2002
Paul Browne Csaba Czirjek Cathal Gurrin Roman Jarina Hyowon Lee Seán Marlow Kieran McDonald Noel Murphy Noel E. O'Connor Alan F. Smeaton Jiamin Ye

Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video Track. In the Feature Extraction task, we submitted 3 features: Face, Speech, and Music. In the Search task, we developed an interactive video retrieval system, which incorporated the 40 hours of the video search test collection and supported user searching using our own feature extract...

2008
Samuel Thomas Sriram Ganapathy Hynek Hermansky

Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands u...
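The Hilbert envelope underlying the technique above is the magnitude of the analytic signal, which `scipy.signal.hilbert` computes directly. A minimal full-band sketch; the narrow sub-band decomposition and reverberation modeling of the paper are not reproduced, and the test tone parameters are assumptions:

```python
import numpy as np
from scipy.signal import hilbert

def hilbert_envelope(x):
    """Temporal envelope of a signal: magnitude of its analytic signal."""
    return np.abs(hilbert(x))

fs = 8000
t = np.arange(fs) / fs            # one second of samples
x = np.sin(2 * np.pi * 440 * t)   # 440 Hz tone of constant amplitude
env = hilbert_envelope(x)         # envelope of a pure tone is ~1 throughout
```

In the far-field setting the abstract targets, such envelopes would be computed per sub-band (e.g. after a filterbank) before further modeling.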

2011
Alfonso M. Canterla Magne Hallstein Johnsen

This paper presents methods and results for optimizing subword detectors in continuous speech. Speech detectors are useful within areas like detection-based ASR, pronunciation training, phonetic analysis, word spotting, etc. We build detectors for both articulatory features and phones by discriminative training of detector-specific MFCC filterbanks and HMMs. The resulting filterbanks are clearl...

2011
Youngja Park

In this paper, we propose advanced text analytics and cost-sensitive classification-based approaches for call quality monitoring and show that automatic quality monitoring with ASR transcripts can be achieved with high accuracy. Our system analyzes ASR transcripts and determines if a call is a good call or a bad call. The set of features were identified through analysis of a large number of hu...

2007
Peter Cahill Daniel Aioanei Julie Carson-Berndsen

The quality of unit selection speech synthesisers depends significantly on the content of the speech database being used. In this paper a technique is introduced that can highlight mispronunciations and abnormal units in the speech synthesis voice database through the use of articulatory acoustic feature extraction to obtain an additional layer of annotation. A set of articulatory acoustic feat...

2013
Hervé Bourlard Marc Ferras Nikolaos Pappas Andrei Popescu-Belis Steve Renals Fergus R. McInnes Peter Bell Sandy Ingram Maël Guillemot

In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected “hyper-events”...
