Search results for: speech feature extraction

Number of results: 480138

2004
Hossein Marvi Edward Chilton

Conventional cepstra are one-dimensional; however, speech characteristics are represented better by an acoustic image, a two-dimensional feature representation. In this paper, acoustic images based on the two-dimensional root cepstrum (TDRC) are used as features for speaker-independent speech recognition. The TDRC is a method of feature extraction which has some advantages over other methods. The ...
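The root cepstrum the abstract refers to replaces the logarithm of the conventional cepstrum with a power-law (root) compression of the magnitude spectrum. A minimal one-dimensional sketch follows; the choice of `gamma = 0.5` and the frame length are assumptions for illustration, and the paper's TDRC builds a two-dimensional acoustic image that this sketch does not reproduce:

```python
import numpy as np

rng = np.random.default_rng(0)

def root_cepstrum(frame, gamma=0.5):
    """One-dimensional root cepstrum of a speech frame.

    Instead of log|X(w)| as in the conventional real cepstrum,
    the magnitude spectrum is compressed by an assumed root gamma.
    """
    spectrum = np.abs(np.fft.fft(frame))
    return np.real(np.fft.ifft(spectrum ** gamma))

# Illustrative windowed frame (random stand-in for real speech samples).
frame = np.hamming(256) * rng.standard_normal(256)
c = root_cepstrum(frame)
```

Setting `gamma` closer to 0 approaches the behaviour of the log cepstrum, while larger values keep more spectral dynamic range.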

2010
Taemin Cho Ron J. Weiss Juan P. Bello

Most automatic chord recognition systems follow a standard approach combining chroma feature extraction, filtering and pattern matching. However, despite much research, there is little understanding about the interaction between these different components, and the optimal parameterization of their variables. In this paper we perform a systematic evaluation including the most common variations i...
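The chroma feature extraction step mentioned above folds spectral energy onto the 12 pitch classes of the octave. A minimal sketch of that folding, not the authors' pipeline; the reference frequency `fmin = 55.0` Hz (pitch class A) and the frame parameters are assumptions:

```python
import numpy as np

def chroma(frame, fs, fmin=55.0):
    """Fold FFT magnitude bins onto 12 pitch classes (a chroma vector)."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    ch = np.zeros(12)
    for f, m in zip(freqs, mag):
        if f < fmin:
            continue  # skip DC and sub-bass bins below the reference
        pc = int(round(12 * np.log2(f / fmin))) % 12  # semitones mod octave
        ch[pc] += m
    return ch / (ch.sum() + 1e-12)  # normalize to a distribution

fs = 8000
n = np.arange(2000)
tone = np.sin(2 * np.pi * 220.0 * n / fs)  # 220 Hz: two octaves above fmin
ch = chroma(tone, fs)
```

Real systems typically add a window, tuning estimation, and smoothing over frames before the pattern-matching stage the abstract describes.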

2003
Luciana Gonçalves da Silveira Jacques Facon Díbio Leandro Borges

Audio-visual Speech Recognition has been an active area of research lately. A big, and as yet unsolved, part of this problem is visual-only recognition, or lip reading. Considering an image sequence of a person pronouncing a word, a full image analysis solution would have to segment the mouth area, extract relevant features, and use them to be able to classify the word from those visual featur...

2012
Malay Kumar

Speech is the most natural way of communication between human beings. The field of speech recognition has long held the intrigue of man–machine conversation, and owing to its versatile applications, automatic speech recognition systems have been designed. In this paper we present a novel approach for Hindi speech recognition by ensembling the feature extraction modules of ASR systems and their outputs...

2002
Paul Browne Csaba Czirjek Cathal Gurrin Roman Jarina Hyowon Lee Seán Marlow Kieran McDonald Noel Murphy Noel E. O'Connor Alan F. Smeaton Jiamin Ye

Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video Track. In the Feature Extraction task, we submitted 3 features: Face, Speech, and Music. In the Search task, we developed an interactive video retrieval system, which incorporated the 40 hours of the video search test collection and supported user searching using our own feature extract...

2008
Samuel Thomas Sriram Ganapathy Hynek Hermansky

Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands u...
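The Hilbert envelope underlying the technique above is the magnitude of the analytic signal, which `scipy.signal.hilbert` computes directly. A minimal full-band sketch; the narrow sub-band decomposition and reverberation modeling of the paper are not reproduced, and the test tone parameters are assumptions:

```python
import numpy as np
from scipy.signal import hilbert

def hilbert_envelope(x):
    """Temporal envelope of a signal: magnitude of its analytic signal."""
    return np.abs(hilbert(x))

fs = 8000
t = np.arange(fs) / fs            # one second of samples
x = np.sin(2 * np.pi * 440 * t)   # 440 Hz tone of constant amplitude
env = hilbert_envelope(x)         # envelope of a pure tone is ~1 throughout
```

In the far-field setting the abstract targets, such envelopes would be computed per sub-band (e.g. after a filterbank) before further modeling.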

2011
Alfonso M. Canterla Magne Hallstein Johnsen

This paper presents methods and results for optimizing subword detectors in continuous speech. Speech detectors are useful within areas like detection-based ASR, pronunciation training, phonetic analysis, word spotting, etc. We build detectors for both articulatory features and phones by discriminative training of detector-specific MFCC filterbanks and HMMs. The resulting filterbanks are clearl...

2011
Youngja Park

In this paper, we propose advanced text analytics and cost-sensitive classification-based approaches for call quality monitoring and show that automatic quality monitoring with ASR transcripts can be achieved with high accuracy. Our system analyzes ASR transcripts and determines if a call is a good call or a bad call. The set of features were identified through analysis of a large number of hu...

2007
Peter Cahill Daniel Aioanei Julie Carson-Berndsen

The quality of unit selection speech synthesisers depends significantly on the content of the speech database being used. In this paper a technique is introduced that can highlight mispronunciations and abnormal units in the speech synthesis voice database through the use of articulatory acoustic feature extraction to obtain an additional layer of annotation. A set of articulatory acoustic feat...

2013
Hervé Bourlard Marc Ferras Nikolaos Pappas Andrei Popescu-Belis Steve Renals Fergus R. McInnes Peter Bell Sandy Ingram Maël Guillemot

In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected “hyper-events”...
