Automatic estimation of dialect mixing ratio for dialect speech recognition

نویسندگان

Naoki Hirayama

Koichiro Yoshino

Katsutoshi Itoyama

Shinsuke Mori

Hiroshi G. Okuno

چکیده

This paper proposes methods for determining an appropriate mixing ratio of dialects in automatic speech recognition (ASR) for dialects. To handle ASR for various dialects, it has been reported to be effective to train a language model using a dialectmixed corpus. One reason behind this is geographical continuity of spoken dialect; we regard spoken dialect as a mixture of various dialects. This mixing ratio changes at every moment as well as depends on a speaker. We can improve recognition accuracy by giving an appropriate dialect mixing ratio for a speaker’s dialect. The mixing ratio is generally unknown and requires to be estimated and updated referring to input utterances. We handle two methods for updating it based on recognition results; one is to compute contribution of dialects for each recognized word, and the other is to predict mixture information referring to a whole recognized sentence based on topic modeling. The experimental result shows that the mixing ratio estimated by these methods realized higher recognition accuracy than a fixed mixing ratio.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Parallel Speech Corpora of Japanese Dialects

Clean speech data is necessary for spoken language processing, however, there is no public Japanese dialect corpus collected for speech processing. Parallel speech corpora of dialect are also important because real dialect affects each other, however, the existing data only includes noisy speech data of dialects and their translation in common language. In this paper, we collected parallel spee...

متن کامل

Automatic Dialect and Accent Recognition and its Application to Speech Recognition

متن کامل

Automatic Speech Recognition for Tunisian Dialect

Speech recognition for under-resourced languages represents an active field of research during the past decade. The tunisian arabic dialect has been chosen as a typical example for an under-resourced Arabic dialect. We propose, in this paper, our first steps to build an automatic speech recognition system for Tunisian dialect. Several Acoustic Models have been trained using HMM-GMM and HMM-DNN ...

متن کامل

Dialect separation assessment using log-likelihood score distributions

Dialect differences within a given language represent major challenges for sustained speech system performance. For speech recognition, little if any knowledge exists on differences between dialects (e.g. vocabulary, grammar, prosody, etc.). Effective dialect classification can contribute to improved ASR, speaker ID, and spoken document retrieval. This study, presents an approach to establish a...

متن کامل

Automatic initial/final generation for dialectal Chinese speech recognition

Phonetic differences always exist between any Chinese dialect and standard Chinese (Putonghua). In this paper, a method, named automatic dialect-specific Initial/Final (IF) generation, is proposed to deal with the issue of phonemic difference which can automatically produce the dialect-specific units based on model distance measure. A dialect-specific decision tree regrowing method is also prop...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2013

Automatic estimation of dialect mixing ratio for dialect speech recognition

نویسندگان

چکیده

منابع مشابه

Parallel Speech Corpora of Japanese Dialects

Automatic Dialect and Accent Recognition and its Application to Speech Recognition

Automatic Speech Recognition for Tunisian Dialect

Dialect separation assessment using log-likelihood score distributions

Automatic initial/final generation for dialectal Chinese speech recognition

عنوان ژورنال:

اشتراک گذاری