Improvements on Automatic Speech Segmentation at the Phonetic Level
نویسندگان
چکیده
In this paper, we present some recent improvements in our automatic speech segmentation system, which only needs the speech signal and the phonetic sequence of each sentence of a corpus to be trained. It estimates a GMM by using all the sentences of the training subcorpus, where each Gaussian distribution represents an acoustic class, which probability densities are combined with a set of conditional probabilities in order to estimate the probability densities of the states of each phonetic unit. The initial values of the conditional probabilities are obtained by using a segmentation of each sentence assigning the same number of frames to each phonetic unit. A DTW algorithm fixes the phonetic boundaries using the known phonetic sequence. This DTW is a step inside an iterative process which aims to segment the corpus and re-estimate the conditional probabilities. The results presented here demonstrate that the system has a good capacity to learn how to identify the phonetic boundaries.
منابع مشابه
Regional Pronunciation Variants for Automatic Segmentation
The goal of this paper is to create an extended rule corpus with approximately 2300 phonetic rules which model segmental variation of regional variants of German. The phonetic rules express at a broad-phonetic level phenomena of phonetic reduction in German that occurs within words and across word boundaries. In order to get an improvement in automatic segmentation of regional speech variants, ...
متن کاملSpeech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers
In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for unsatisfactory performance of the state-of-the-art ASR systems, that are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low level or phonetic level linguistic information in the speech...
متن کاملEvaluation of a segmentation system based on multi-level lattices
This paper addresses the problem of the evaluation of an automatic continuous speech segmentation system based on multi-level lattices called dendrograms previously described in [2] [3]. After a short overview of the segmentation framework, we briefly discuss the difficulties inherent to such an evaluation. We first state quantitative results based on the usual quality coefficient and oversegme...
متن کاملA Comparison of Different Approaches to Automatic Speech Segmentation
We compare different methods for obtaining accurate speech segmentations starting from the corresponding orthography. The complete segmentation process can be decomposed into two basic steps. First, a phonetic transcription is automatically produced with the help of large vocabulary continuous speech recognition (LVCSR). Then, the phonetic information and the speech signal serve as input to a s...
متن کاملAutomatic segmentation of speech based on hidden Markov models and acoustic features
An accurate database segmented and labeled at phonetic, subword or word level is very important for speech research. However, manual segmentation and labeling is a time consuming and error prone task. This paper describes an automatic procedure for the segmentation of speech in a set of acoustic sub-words units: given either the linguistic or the phonetic content of a speech utterance, the syst...
متن کامل