Syllable-based acoustic modeling for Japanese spontaneous speech recognition

Authors

  • Jun Ogata
  • Yasuo Ariki
Abstract

We study a syllable-based acoustic modeling method for Japanese spontaneous speech recognition. Traditionally, mora-based acoustic models have been adopted for Japanese read speech recognition systems. In this paper, the syllable-based unit and the mora-based unit are clearly distinguished by definition, and syllables are shown to be more suitable as an acoustic modeling unit for Japanese spontaneous speech recognition. In spontaneous speech, vowel lengthening occurs frequently, and recognition accuracy is greatly affected by this phenomenon. From this viewpoint, we propose an acoustic modeling technique that explicitly incorporates vowel lengthening in syllable-based HMMs. Experimental results showed that the proposed model outperformed the conventionally used cross-word triphone model and the mora-based model in a Japanese spontaneous speech recognition task.
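
To make the mora/syllable distinction concrete, here is a minimal sketch of one common way to group morae into syllables. It is not the authors' definition, which the abstract does not spell out. The symbols `N` (moraic nasal), `Q` (geminate consonant), and `H` (second half of a long vowel), as well as the function name `morae_to_syllables`, are assumptions introduced for illustration only.

```python
# Special morae that are assumed never to start a syllable.
# This inventory is an illustrative assumption, not taken from the paper:
#   "N" = moraic nasal, "Q" = geminate consonant, "H" = long-vowel extension.
SPECIAL_MORAE = {"N", "Q", "H"}


def morae_to_syllables(morae):
    """Group a mora sequence into syllables.

    Each special mora is attached to the preceding mora, so a long
    vowel such as ko + H becomes a single syllable unit.
    """
    syllables = []
    for mora in morae:
        if mora in SPECIAL_MORAE and syllables:
            # Attach the special mora to the syllable already in progress.
            syllables[-1].append(mora)
        else:
            # Any other mora opens a new syllable.
            syllables.append([mora])
    return syllables


# "gakkou" (school) has four morae but only two syllables under this grouping.
print(morae_to_syllables(["ga", "Q", "ko", "H"]))
```

Under this grouping a lengthened vowel stays inside one unit rather than being split across two morae, which may be part of why a syllable-based unit is convenient for modeling vowel lengthening.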

Similar resources

Acoustic modeling for spontaneous speech recognition using syllable dependent models

This paper proposes a syllable context dependent model for spontaneous speech recognition. It is generally assumed that, since spontaneous speech is greatly affected by coarticulation, an acoustic model featuring a longer range phonemic context is required to achieve a high degree of recognition accuracy. This motivated the authors to investigate a tri-syllable model that takes differences in t...


Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...


Modeling Frequent Allophones in Jap

In this paper, we describe a technique to model frequent allophones in Japanese speech recognition. The Consonant-Vowel syllabic structure (CV) is dominant in Japanese. Based on frequency, the distribution of CV pairs is rather skewed. Isolating out the most frequent allophones through the use of additional phonemes in acoustic modeling can achieve better recognition accuracy. By introducing te...


Towards the creation of acoustic models for stressed Japanese speech

In error-recovery utterances, the user of a speech recognition system changes his or her speaking style to help the system recognize the speech. However, this change leads to a mismatch with the acoustic models and reduces the performance of the system. This degradation is a serious problem for speech recognition in a dialog system or a speech translation system. In error recovery...


Recent Progress in Corpus-Based Spontaneous Speech Recognition

This paper overviews recent progress in the development of corpus-based spontaneous speech recognition technology. Although speech is in almost any situation spontaneous, recognition of spontaneous speech is an area which has only recently emerged in the field of automatic speech recognition. Broadening the application of speech recognition depends crucially on raising recognition performance f...


Publication date: 2003