Improving the Accuracy of the Speech Synthesis Based Phonetic Alignment Using Multiple Acoustic Features

Authors

Sérgio Paulo

Luís C. Oliveira

Abstract

The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterance. However, without a careful choice of the acoustic features used in this procedure, it can perform poorly when applied to continuous speech. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
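As a rough illustration of the synthesis-based approach, the sketch below aligns a synthetic reference with a spoken utterance and transfers the known segment boundaries from one to the other. It assumes dynamic time warping (DTW) as the alignment technique and a plain Euclidean frame distance; the abstract names neither, and the function names `dtw_path` and `map_boundaries` are hypothetical. The per-class-pair feature selection proposed in the paper is not reproduced here; the `dist` argument only marks where such a choice could plug in.

```python
from math import inf, sqrt


def euclidean(a, b):
    """Euclidean distance between two feature frames (assumed distance)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(ref, obs, dist):
    """Return a minimum-cost warping path as (ref_frame, obs_frame) pairs."""
    n, m = len(ref), len(obs)
    # cost[i][j] = cheapest way to align ref[:i] with obs[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(ref[i - 1], obs[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def map_boundaries(path, ref_boundaries):
    """Map each reference boundary frame to the first spoken frame aligned with it."""
    first_obs = {}
    for i, j in path:
        first_obs.setdefault(i, j)
    return [first_obs[b] for b in ref_boundaries]


# Toy one-dimensional features: two segments, the second starting at frame 2
# of the synthetic reference. The spoken utterance is slower.
ref = [(0.0,), (0.0,), (1.0,), (1.0,)]
obs = [(0.0,), (0.0,), (0.0,), (1.0,), (1.0,), (1.0,)]
path = dtw_path(ref, obs, euclidean)
spoken_boundaries = map_boundaries(path, [2])
```

Because the segment boundaries of the synthetic signal are known from the synthesizer, only the warping path is needed to place them in the spoken utterance. In this sketch, swapping `euclidean` for a distance that depends on the phonetic classes of the frames being compared would be one way to experiment with feature selection.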

Similar resources

Designing a phoneme recognition algorithm using the acoustic correlates of phonological features

In the present paper, the phonological feature geometry of the Persian phonemes is analyzed in terms of articulator-free and articulator-bound features, based on the articulator model of nonlinear phonology. Then, the reference phonetic pattern of each feature that consists of one or a set of acoustic correlates, characterized by the quantitative or qualitative values in its phonological re...


DTW-based phonetic alignment using multiple acoustic features

This paper presents the results of our effort in improving the accuracy of a DTW-based automatic phonetic aligner. The adopted model assumes that the phonetic segment sequence is already known and so the goal is only to align the spoken utterance with a reference synthetic signal produced by waveform concatenation without prosodic modifications. Instead of using a single acoustic measure to com...


Highly accurate phonetic segmentation using boundary correction models and system fusion

Accurate phone-level segmentation of speech remains an important task for many subfields of speech research. We investigate techniques for boosting the accuracy of automatic phonetic segmentation based on HMM acoustic-phonetic models. In prior work [25] we were able to improve on state-of-the-art alignment accuracy by employing special phone boundary HMM models, trained on phonetically segmente...


Areal and Phylogenetic Features for Multilingual Speech Synthesis

We introduce phylogenetic and areal language features to the domain of multilingual text-to-speech synthesis. Intuitively, enriching the existing universal phonetic features with crosslingual shared representations should benefit the multilingual acoustic models and help to address issues like data scarcity for low-resource languages. We investigate these representations using the acoustic mode...


Acoustic measures vs. phonetic features as predictors of audible discontinuity in concatenative speech synthesis

Most concatenative speech synthesizers employ both acoustic measures and phonetic features to predict the perceptual damage caused by concatenating two waveform segments because no reliable acoustic measure has been found so far. This paper compares the predicting ability of the two kinds of predictor variables. We first conduct a perceptual experiment to measure the naturalness degradation due...


Publication date: 2003
