Multi-State Time Delay Neural Networks for Continuous Speech Recognition
Authors
Alex Waibel, Carnegie Mellon University, Pittsburgh, PA 15213
Abstract
We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition. Unlike most other hybrid methods, the MS-TDNN embeds an alignment search procedure into the connectionist architecture and allows for word-level supervision. The resulting system can manage the sequential order of subword units while optimizing for recognizer performance. In this paper we present extensive new evaluations of this approach over speaker-dependent and speaker-independent connected alphabet.
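The alignment search mentioned in the abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the function name `word_score`, the score table, and the rule that a path may only stay in a state or advance by one state per frame are assumptions chosen for illustration. It shows only how a word-level score could be obtained from frame-level state scores while respecting the sequential order of subword units; training through the alignment path is not shown.

```python
def word_score(frame_scores, state_sequence):
    """Best monotonic alignment score of frames to an ordered list of states.

    frame_scores[t][s] is a (hypothetical) score of state s at frame t.
    state_sequence is the ordered list of state indices that make up a word.
    The path starts in the first state, ends in the last, and at each frame
    either stays in the current state or advances to the next one.
    """
    n_frames = len(frame_scores)
    n_states = len(state_sequence)
    if n_states == 0 or n_frames < n_states:
        raise ValueError("need at least one state and one frame per state")

    neg_inf = float("-inf")
    # best[j]: best accumulated score of a path that is in state j
    # at the current frame.
    best = [neg_inf] * n_states
    best[0] = frame_scores[0][state_sequence[0]]

    for t in range(1, n_frames):
        new_best = [neg_inf] * n_states
        for j in range(n_states):
            stay = best[j]
            advance = best[j - 1] if j > 0 else neg_inf
            prev = max(stay, advance)
            if prev > neg_inf:
                new_best[j] = prev + frame_scores[t][state_sequence[j]]
        best = new_best

    return best[-1]


# Toy example: 3 frames, 2 states, integer scores (illustrative values only).
scores = [[9, 1], [8, 2], [1, 7]]
score_a = word_score(scores, [0, 1])  # word made of state 0 then state 1
score_b = word_score(scores, [1, 0])  # word made of state 1 then state 0
best_word = "A" if score_a > score_b else "B"
```

In this toy case the word whose state order matches the frame scores receives the higher score, which is the kind of word-level quantity that word-level supervision could then act on.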
Similar resources
Integrating Time Alignment and Neural Networks for High Performance Continuous Speech Recognition
Successful application of existing connectionist methods to continuous speech recognition requires the use of time-alignment procedures. These procedures, usually based on dynamic programming, provide means for supervising the training of neural networks. This paper describes two systems in which neural network classifiers are merged with dynamic programming (DP) time alignment methods to produ...
Continuous Speech Phoneme Recognition Using Dynamic Artificial Neural Networks
Phoneme classification and recognition is the first step toward large-vocabulary continuous speech recognition. This step represents the acoustic modeling part of such a system. In hybrid speech recognition systems, phoneme recognition is performed by artificial neural networks (ANNs). The main objective of this paper is the investigation of dynamic ANNs, namely the Time-Delay Neural Networks (TDNN) an...
New variant of the Self Organizing Map in Pulsed Neural Networks to Improve Phoneme Recognition in Continuous Speech
Speech recognition has gradually improved over the years, phoneme recognition in particular. Phoneme recognition plays a very important role in speech processing. Phoneme strings are a basic representation for automatic language recognition, and it has been shown that language recognition results are highly correlated with phoneme recognition results. Nowadays, many recognizers are based on Artificial ne...
Learning of Word Boundaries In Continuous Speech Using Time Delay Neural Networks
This paper presents early research results for a method for training neural networks to segment previously unseen continuous speech data into words, without the use of a lexicon or a speech recognition engine, or any other forms of supervision. The initial word segmentation is derived through a simple and naïve method, and our method then leverages the time-invariant properties of the TDNN to ...
Performance Through Consistency: MS-TDNN's for Large Vocabulary Continuous Speech Recognition
Connectionist speech recognition systems are often handicapped by an inconsistency between training and testing criteria. This problem is addressed by the Multi-State Time Delay Neural Network (MS-TDNN), a hierarchical phoneme and word classifier which uses DTW to modulate its connectivity pattern, and which is directly trained on word-level targets. The consistent use of word accuracy as a c...