Speech Recognition as a Retrieval Problem
Authors
Abstract
Common approaches to automatic speech recognition (ASR) train statistical models of the acoustics of speech. In our work, we develop a retrieval-based ASR system that does not rely on training and is therefore more flexible to apply. It relies on a set of known reference utterances for each word that may occur in a test string. A test word string is identified by finding the most similar reference for each word, using an approach based on dynamic time warping (DTW). The DTW variant suited to recognizing strings of connected words is level-building DTW, proposed by Myers and Rabiner in 1981. It uses a level-by-level iteration to match each word in the test utterance with the most similar reference. We develop an ASR system for connected digit recognition based on level-building DTW, evaluate it, and compare it with a state-of-the-art HMM recognizer.
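The retrieval idea in the abstract can be illustrated with a minimal sketch of plain DTW matching for a single isolated word. This is not the level-building variant and not the authors' implementation: it assumes one-dimensional feature sequences and an absolute-difference local cost, and the names `dtw_distance`, `recognize_word`, and the toy reference values are hypothetical, chosen only for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences.

    Assumption: the local cost is the absolute difference of two samples.
    Real systems compare feature vectors (e.g. per-frame spectral features).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize_word(test, references):
    """Return the label of the reference closest to `test` under DTW."""
    return min(references, key=lambda label: dtw_distance(test, references[label]))


# Hypothetical toy references: one short sequence per word.
references = {
    "one": [1.0, 2.0, 3.0, 2.0, 1.0],
    "two": [3.0, 3.0, 1.0, 1.0, 3.0],
}

# A time-stretched version of the "one" reference.
test = [1.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0]

best = recognize_word(test, references)
```

Because DTW lets a frame repeat, the stretched test sequence aligns with the "one" reference at zero cost, so no training step is needed to recognize it. Level-building DTW extends this idea to connected words by chaining such alignments level by level, which this sketch does not attempt.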
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Presenting a New Information Retrieval Method Suitable for Texts Produced by Speech Recognition
In this article, a pre-processing method is introduced that is applicable to the task of retrieving speech-recognized texts. The inputs are a text corpus generated by a speech recognition system and a query; the goal is to search these documents for the query and find the relevant ones. A basic problem in typical speech-recognized text is that some percentage of it is recognition errors. This results erroneously ass...
On the Use of Information Retrieval Measures for Speech Recognition Evaluation
This paper discusses the evaluation of automatic speech recognition (ASR) systems developed for practical applications, suggesting a set of criteria for application-oriented performance measures. The commonly used word error rate (WER), which poses ASR evaluation as a string editing process, is shown to have a number of limitations with respect to these criteria, motivating alternative or addit...
Application of Over-complete Blind so Automatic Speech Re
Spoken-dialogue-based information retrieval systems used in mobile environments are becoming popular. However, the mobile environment changes dynamically and there are many interfering signals. These two effects degrade automatic speech recognition (ASR) accuracy and hence the performance of spoken-dialogue-based information retrieval systems. One way to...
Subword-based approaches for spoken document retrieval
This paper explores approaches to the problem of spoken document retrieval (SDR), which is the task of automatically indexing and then retrieving relevant items from a large collection of recorded speech messages in response to a user-specified natural language text query. We investigate the use of subword unit representations for SDR as an alternative to words generated by either keyword spott...
Speech recognition in the Informedia Digital Video Library: uses and limitations
In principle, speech recognition technology can make any spoken data useful for library indexing and retrieval. This paper describes the Informedia Digital Video Library project and discusses how speech recognition is used for transcript creation from video, alignment with hand-generated transcripts, query interface and audio paragraph segmentation. The results show that speech recognition accur...