Automatic Reconstruction of Utterance Boundaries Time Marks in Speech Database Re-grabbed from DAT Recorder

Author

  • Hynek Bořil
Abstract

This paper presents an algorithm for automatically reconstructing the time marks of utterance boundaries in a speech database re-grabbed from a DAT recorder. The database was originally grabbed from DAT and, after down-sampling, processed at 16 kHz. Utterance boundaries were located manually, each utterance was stored in a separate file, and orthographic and phonetic transcriptions were made. More recently, a requirement arose to re-grab and process the database at 48 kHz. Since the utterance boundary positions were already known for the 16 kHz data, it was reasonable to reuse this information for the 48 kHz processing. Unfortunately, the re-grabbed sessions differed somewhat in length from the originally grabbed data. The presented algorithm finds the session boundaries, matches the session lengths, and calculates the actual positions of the utterance boundaries. Finally, the matching accuracy is checked automatically.
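The boundary-mapping step can be illustrated with a minimal sketch. The abstract does not say how the session lengths are matched, so the linear stretch below is only an assumption, and the function name `map_boundaries` and its parameters are hypothetical. Given the start and end sample of a session in the original 16 kHz data and in the re-grabbed 48 kHz data, each known utterance boundary is rescaled in proportion to the two session lengths.

```python
def map_boundaries(marks_old, old_session, new_session):
    """Map boundary sample indices from the original session onto the re-grabbed one.

    Assumption (not stated in the abstract): the length difference between
    the two recordings is spread uniformly, so a linear stretch is adequate.
    marks_old    -- utterance boundaries, in samples of the original recording
    old_session  -- (start, end) of the session in the original recording
    new_session  -- (start, end) of the same session in the re-grabbed recording
    """
    old_start, old_end = old_session
    new_start, new_end = new_session
    old_len = old_end - old_start
    new_len = new_end - new_start
    if old_len <= 0 or new_len <= 0:
        raise ValueError("session end must lie after session start")
    # Multiply before dividing so the ratio also absorbs the 16 kHz -> 48 kHz
    # sample-rate change together with any small length mismatch.
    return [new_start + round((m - old_start) * new_len / old_len)
            for m in marks_old]


# Hypothetical session: 16000 samples originally, 48030 samples after re-grabbing,
# with the re-grabbed session starting 100 samples into the new recording.
print(map_boundaries([0, 8000, 16000], (0, 16000), (100, 48130)))
```

A real pipeline would first have to locate the session boundaries in both recordings and would need a separate accuracy check, as the abstract describes; neither step is shown here.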

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract   Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

On Word Boundary Detection in Digit-based Speaker Verification

In an automatic speaker verification (ASV) system with prompted passwords, we use vocabulary-dependent hidden Markov models and rely on the ability to explicitly locate the corresponding words and their boundaries in the speech signal. In an experiment on 41 speakers in a Swedish telephone speech database, we compare the use of utterance segmentation produced by automatic and manual methods, an...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Acoustic cues for the classification of re

Irregular phonation serves an important communicative function. It can be a cue to linguistic contrasts, and often serves as a marker for word and utterance boundaries. Automatic methods for classification and detection of regions of irregular phonation can be used to improve analyses of occurrences of irregular phonation and support technologies such as speech recognition and synthesis. This s...

Automatic detection and segmentation of pronunciation variants in German speech corpora

In this paper we present a hybrid statistical and rule-based segmentation system which takes into account phonetic variation of German. Input to the system is the orthographic representation and the speech signal of an utterance to be segmented. The output is the transcription (SAM-PA) with the highest overall likelihood and the corresponding segmentation of the speech signal. The system consis...

Publication date: 2005