Automatic detection and segmentation of pronunciation variants in German speech corpora
Authors
Abstract
In this paper we present a hybrid statistical and rule-based segmentation system that takes into account the phonetic variation of German. The input to the system is the orthographic representation and the speech signal of the utterance to be segmented. The output is the transcription (SAM-PA) with the highest overall likelihood and the corresponding segmentation of the speech signal. The system consists of three main parts. In a first stage, the orthographic representation is converted into a linear string of phonetic units by lexicon lookup. Phonetic rules are then applied, yielding a graph that contains the canonical form and presumed variations. In a second, HMM-based stage, the speech signal of the utterance is time-aligned by a Viterbi search that is constrained by the graph from the first stage. The outcome of this stage is a string of phonetic labels and the corresponding segment boundaries. In a third stage, the segment boundaries are refined by rules that use phonetic knowledge.
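The first two stages can be illustrated with a small Python sketch. It is not the authors' implementation: the lexicon, the three rewrite rules, the per-frame log-likelihoods and all function names are hypothetical. For brevity, the variant graph is expanded into an explicit set of phone strings and each string is aligned separately, rather than running one Viterbi search over the graph with HMM scores. The rule-based boundary refinement of the third stage is not shown.

```python
# Hypothetical toy lexicon: orthographic word -> canonical SAM-PA phones.
LEXICON = {"haben": ("h", "a:", "b", "@", "n")}

# Hypothetical reduction rules (pattern -> replacement), for illustration only.
RULES = [
    (("@", "n"), ("n",)),      # schwa elision
    (("b", "n"), ("b", "m")),  # nasal assimilation
    (("b", "m"), ("m",)),      # plosive deletion
]

FLOOR = -5.0  # assumed log-likelihood of a phone a frame does not favour


def canonical(words):
    """Stage 1a: lexicon lookup gives one linear phone string."""
    return tuple(p for w in words for p in LEXICON[w])


def variants(phones):
    """Stage 1b: apply the rules repeatedly and collect every reachable form."""
    seen, frontier = {phones}, [phones]
    while frontier:
        cur = frontier.pop()
        for pat, repl in RULES:
            n = len(pat)
            for i in range(len(cur) - n + 1):
                if cur[i:i + n] == pat:
                    new = cur[:i] + repl + cur[i + n:]
                    if new not in seen:
                        seen.add(new)
                        frontier.append(new)
    return seen


def align(phones, frames):
    """Stage 2: Viterbi alignment of one phone string to the frames.

    frames[t] maps a phone to its log-likelihood at frame t.
    Returns (log-likelihood, first frame index of each phone).
    """
    T, J, neg = len(frames), len(phones), float("-inf")
    score = [[neg] * J for _ in range(T)]
    back = [[0] * J for _ in range(T)]
    score[0][0] = frames[0].get(phones[0], FLOOR)
    for t in range(1, T):
        for j in range(J):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg
            best, back[t][j] = (stay, j) if stay >= move else (move, j - 1)
            if best > neg:
                score[t][j] = best + frames[t].get(phones[j], FLOOR)
    if score[T - 1][J - 1] == neg:  # more phones than frames
        return neg, None
    path = [J - 1]
    for t in range(T - 1, 0, -1):  # trace the best path backwards
        path.append(back[t][path[-1]])
    path.reverse()
    return score[T - 1][J - 1], [path.index(j) for j in range(J)]


def best_transcription(words, frames):
    """Pick the variant with the highest overall likelihood."""
    results = [align(v, frames) + (v,) for v in variants(canonical(words))]
    score, starts, phones = max(results, key=lambda r: r[0])
    return score, phones, starts


def make_frames(favoured):
    """Toy acoustic scores: each frame strongly favours one phone."""
    return [{p: -0.1} for p in favoured]
```

Calling `best_transcription(["haben"], make_frames(["h", "a:", "a:", "m"]))` selects the reduced form `("h", "a:", "m")` over the canonical form, with the segment starts given as frame indices.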
Similar resources
Independent automatic segmentation by self-learning categorial pronunciation rules
The goal of this paper is to present a new method to automatically generate pronunciation rules for automatic segmentation of speech the German MAUSER system. MAUSER is an algorithm which generates pronunciation rules independently of any domain dependent training data either by clustering and statistically weighting self-learned rules according to a small set of phonological rules clustered by...
Regional Pronunciation Variants for Automatic Segmentation
The goal of this paper is to create an extended rule corpus with approximately 2300 phonetic rules which model segmental variation of regional variants of German. The phonetic rules express at a broad-phonetic level phenomena of phonetic reduction in German that occur within words and across word boundaries. In order to get an improvement in automatic segmentation of regional speech variants, ...
Automatic Phonetic Transcription of Non-Prompted Speech
Automatic Segmentation" (MAUS) system labels and segments the phonetic constituents of spoken German in a manner similar to highly trained phoneticians. MAUS has been used to train automatic speech recognition (ASR) systems as well as to provide detailed statistical analyses of spontaneous speech (using the Verbmobil I and RVG I corpora). The MAUS system is a reliable, automatic means of testin...
Statistical Modelling of Pronunciation: It's Not the Model, It's the Data
In this paper we describe a method to model pronunciation for ASR in the German VERBMOBIL task. Our findings suggest that a simple model, i.e. pronunciation variants modelled by SAM-PA units and weighted with a-posteriori probabilities, can be used successfully for ASR, if there is a sufficient amount of reliably transcribed speech data available. Manual segmentation and labelling of speech (especi...
Pronunciation modeling applied to automatic segmentation of spontaneous speech
In this paper two different models of pronunciation are presented: the first model is based on a rule set compiled by an expert, while the second is statistically based, exploiting a survey about pronunciation variants occurring in training data. Both models generate pronunciation variants from the canonical forms of words. The two models are evaluated by applying them to the task of automatic segme...