Exploiting speech knowledge in neural nets for recognition

Author

  • Mark Huckvale
Abstract

This paper argues that neural networks are good vehicles for automatic speech recognition not simply because they provide non-linear pattern recognition but because their architecture allows the incorporation and exploitation of existing knowledge about speech. The paper is in two parts: Part I defends the need for the incorporation of existing knowledge, while Part II sketches a speech recognition architecture that uses neural networks to represent and exploit existing phonological and linguistic knowledge. The first part of the paper argues that the definition of the speech recognition problem implies that prior knowledge of linguistic analysis is essential for its solution, and suggests that the currently poor exploitation of such knowledge is a consequence of contemporary pattern recognition architectures. Criticism is made of the current emphasis on syntactic pattern recognition algorithms operating at the level of the phonetic segment. The second part of the paper demonstrates that a network architecture for the lexicon provides a mechanism for the incorporation and exploitation of a range of phonological analyses. Furthermore, through the explicit separation of phonological representations from phonetic ones, there exists the possibility of constructing a front-end phonetic component on purely pattern recognition principles. Through normalisation of speaker and environment, the phonetic component may be interfaced to the network lexicon to provide a complete recognition architecture which avoids compromise in the exploitation of speech knowledge.

Similar resources

A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the signal's time-frequency representation (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced by the remaining components and statistical ...

Consonant Recognition by Modular Construction of Large Phonemic Time-Delay Neural Networks

Alex Waibel, Carnegie-Mellon University, Pittsburgh, PA 15213; ATR Interpreting Telephony Research Laboratories, Osaka, Japan. In this paper we show that neural networks for speech recognition can be constructed in a modular fashion by exploiting the hidden structure of previously trained phonetic subcategory networks. The performance of resulting larger phonetic nets was found to be as good as t...

Modularity and Scaling in Large Phonemic Neural Networks

Scaling connectionist models to larger connectionist systems is difficult because larger networks require increasing amounts of training time and data, and the complexity of the optimization task quickly reaches computationally unmanageable proportions. In this paper, we train several small Time-Delay Neural Networks aimed at all phonemic subcategories (nasals, fricatives, etc.) and report exce...

Modular Construction of Time-Delay Neural Networks for Speech Recognition

Several strategies are described that overcome limitations of basic network models as steps towards the design of large connectionist speech recognition systems. The two major areas of concern are the problem of time and the problem of scaling. Speech signals continuously vary over time and encode and transmit enormous amounts of human knowledge. To decode these signals, neural networks must be...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Journal title:
  • Speech Communication

Volume 9  Issue 

Pages  -

Publication date 1990