Transmissions and transitions: a study of two common assumptions in multi-band ASR

Authors

Nikki Mirghafori

Nelson Morgan

Abstract

Is multi-band ASR inherently inferior to a full-band approach because phonetic information is lost due to the division of the frequency space into sub-bands? Do the phonetic transitions in sub-bands occur at different times? The first statement is a common objection of the critics of multi-band ASR, and the second, a common assumption by multi-band researchers. This paper is dedicated to finding answers to both these questions. To study the first point, we calculate phonetic feature transmission for sub-bands. Not only do we fail to substantiate the above objection, but we observe the contrary. We confirm the second hypothesis by analyzing the phonetic transition lags in each sub-band. These results reinforce our view that multi-band speech analysis provides useful information for ASR, particularly when band merging takes place at the end state for a phonetic or syllabic model, allowing sub-bands to be independently time-aligned within the model.
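The abstract does not state how phonetic feature transmission was computed. One common way to quantify it, in the style of Miller and Nicely's confusion-matrix analysis, is the mutual information between the true and the recognized feature value, normalized by the entropy of the true value. The sketch below assumes that definition; the function names, the four-phone inventory, and the counts are all hypothetical toy values, not data from the paper.

```python
import math


def relative_transmission(confusion):
    """Return T(x;y) / H(x) for a confusion matrix of counts.

    confusion[i][j] is the number of tokens whose true class is i and
    whose recognized class is j. The result is 1.0 when the input is
    transmitted perfectly and 0.0 when input and output are independent.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    p_in = [sum(row) / total for row in confusion]          # row marginals
    p_out = [sum(col) / total for col in zip(*confusion)]   # column marginals
    mutual = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count:
                p_ij = count / total
                mutual += p_ij * math.log2(p_ij / (p_in[i] * p_out[j]))
    entropy_in = -sum(p * math.log2(p) for p in p_in if p > 0)
    return mutual / entropy_in if entropy_in > 0 else 0.0


def collapse_by_feature(confusion, feature_of):
    """Pool a phone confusion matrix into a feature-value confusion matrix.

    feature_of[i] is the feature value (e.g. voiced or not) of phone i.
    """
    values = sorted(set(feature_of))
    index = {value: k for k, value in enumerate(values)}
    pooled = [[0] * len(values) for _ in values]
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            pooled[index[feature_of[i]]][index[feature_of[j]]] += count
    return pooled


# Hypothetical sub-band confusion counts for the phones /p b t d/ (toy data).
toy_confusion = [
    [40, 2, 8, 0],
    [3, 38, 1, 8],
    [9, 0, 39, 2],
    [1, 7, 2, 40],
]
voicing = [0, 1, 0, 1]  # assumed feature labels: 0 = voiceless, 1 = voiced

voicing_confusion = collapse_by_feature(toy_confusion, voicing)
voicing_transmission = relative_transmission(voicing_confusion)
print(f"relative voicing transmission: {voicing_transmission:.3f}")
```

Running the same computation on the confusion matrix of each sub-band recognizer and on that of a full-band recognizer would give per-feature numbers that can be compared directly, which is the kind of comparison the first question in the abstract calls for.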

Similar resources

Asynchrony with trained transition probabilities improves performance in multi-band speech recognition

One of the central themes in multi-band automatic speech recognition (ASR) is to devise a strategy for recombining sub-band information. This in turn raises two questions: (1) at what phonetic unit should the recombination take place? (2) How asynchronously should the sub-bands be run? Theoretically asynchronous multi-band ASR should perform at least as well as synchronous multi-band ASR. Howev...

Multi-stream adaptive evidence combination for noise robust ASR

In this paper, we develop different mathematical models in the framework of the multi-stream paradigm for noise robust automatic speech recognition (ASR), and discuss their close relationship with human speech perception. Largely inspired by Fletcher's "product-of-errors" rule (PoE rule) in psychoacoustics, multi-band ASR aims for robustness to data mismatch through the exploitation of spectra...

Some Applications of a Priori Knowledge in Multi-stream HMM and HMM/ANN Based ASR

Multi-band ASR was largely inspired by the extremely high level of redundancy in the spectral signal representation which can be inferred from Fletcher’s product-of-errors rule for human speech perception. Indeed, the main aim of the multi-band approach is to exploit this redundancy in order to overcome the problem of data mismatch (while making no assumptions about noise type) by focusing recog...

3D Classification of Urban Features Based on Integration of Structural and Spectral Information from UAV Imagery

Three-dimensional classification of urban features is one of the important tools for urban management and the basis of many analyses in photogrammetry and remote sensing. It is therefore used in applications such as planning, urban management and disaster management. In this study, dense point clouds extracted by dense image matching are used for classification in urban areas. Appl...

Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers

Multi-band automatic speech recognition is a new and exploratory area of speech recognition which has been getting much attention in the research community. It has been shown that multi-band ASR reduces word error in noisy conditions, particularly in the case of narrow band noise. In this work we show that multi-band ASR could be used to improve the speech recognition accuracy of natural numbers...

Publication date: 1998
