Deriving document structure from prosodic cues

Authors

  • Martin Haase
  • Werner Kriechbaum
  • Gregor Möhler
  • Gerhard Stenzel
Abstract

This study presents an approach for prosody-driven segmentation of speech data. The model is based solely on F0 contours and RMS envelopes; phoneme or word information from a speech recognizer is unnecessary. Using data from German broadcast news, we show how this prosodic information can be exploited to retrieve structural information about the spoken text. The suitability of the CART-like algorithm for utterance boundary prediction was evaluated on 7 five-minute news reports, using 28 reports as training material for the classification tree. Sentence boundaries were predicted with a precision of 93% at a recall of 88%.
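
As a rough illustration of the kind of signal the abstract refers to, the sketch below computes a frame-wise RMS envelope, marks low-energy runs as pause candidates, and scores predicted boundaries by precision and recall. It is not the authors' method: the paper uses a CART-like classification tree over F0 and RMS features, whereas this is a plain energy threshold. The function names, the threshold, and the frame sizes are all hypothetical choices made for the example.

```python
import math

def rms_envelope(samples, frame_len, hop):
    """Frame-wise root-mean-square energy of a mono signal (list of floats)."""
    env = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env

def pause_candidates(env, threshold, min_frames):
    """Return (start, end) frame ranges where the envelope stays below
    `threshold` for at least `min_frames` consecutive frames.
    Both parameters are illustrative assumptions, not values from the paper."""
    runs = []
    start = None
    for i, value in enumerate(env):
        if value < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_frames:
                runs.append((start, i))
            start = None
    # Close a low-energy run that extends to the end of the signal.
    if start is not None and len(env) - start >= min_frames:
        runs.append((start, len(env)))
    return runs

def precision_recall(predicted, reference):
    """Precision and recall of predicted boundary labels against a reference set."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# Toy signal: loud, silent, loud.
signal = [1.0] * 4 + [0.0] * 6 + [1.0] * 4
envelope = rms_envelope(signal, frame_len=2, hop=2)
pauses = pause_candidates(envelope, threshold=0.5, min_frames=2)
```

In a fuller pipeline, features derived from such pause candidates (together with F0 contour features) would feed a trained classifier, and `precision_recall` would compare its predicted sentence boundaries with hand-labelled ones.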

Similar resources

Production of English Lexical Stress by Persian EFL Learners

This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...

Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge

Deception is a deliberate act to deceive interlocutor by transmitting a message containing false or misleading information. Detection of deception consists in the search for reliable differences between liars and truth-tellers. In this paper, we used the Deceptive Speech Database (DSD) provided for the Deception sub-challenge. DSD consists of deceptive and non-deceptive answers to a set of unkn...

Identification of vowel length, word stress, and compound words and phrases by postlingually deafened cochlear implant listeners.

BACKGROUND The accurate perception of prosody assists a listener in deriving meaning from natural speech. Few studies have addressed the ability of cochlear implant (CI) listeners to perceive the brief duration prosodic cues involved in contrastive vowel length, word stress, and compound word and phrase identification. PURPOSE To compare performance in the perception of brief duration prosodi...

The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners

The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...

Modelling the Interplay of Multiple Cues in Prosodic Focus Marking

Focus marking is an important function of prosody in many languages. While many phonological accounts concentrate on fundamental frequency (F0), studies have established several additional cues to information structure. However, the relationship between these cues is rarely investigated. We simultaneously analyzed five prosodic cues to focus—F0 range, word duration, intensity, voice quality, th...


Publication date: 2001