Deriving document structure from prosodic cues
Authors
Abstract
This study presents an approach to prosody-driven segmentation of speech data. The model relies solely on F0 contours and RMS envelopes; it requires no phoneme or word information from a speech recognizer. Using data from German broadcast news, we show how this prosodic information can be exploited to retrieve structural information about the spoken text. The suitability of a CART-like algorithm for utterance boundary prediction was evaluated on seven five-minute news reports, using 28 reports as training material for the classification tree. Sentence boundaries were predicted with a precision of 93% at a recall of 88%.
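As an illustration of how precision and recall figures of this kind are typically computed for boundary prediction, here is a minimal sketch. It is not the authors' evaluation procedure: the abstract does not say how a predicted boundary is counted as correct, so the time tolerance, the greedy one-to-one matching, and the function names are all assumptions, and the boundary times are made-up toy values rather than data from the study.

```python
def match_boundaries(predicted, reference, tolerance=0.2):
    """Count predicted boundary times (in seconds) that fall within
    `tolerance` of a not-yet-matched reference boundary.

    The tolerance and the greedy matching are assumptions made for
    illustration; the abstract does not specify a matching rule.
    """
    unused = sorted(reference)
    hits = 0
    for p in sorted(predicted):
        for i, r in enumerate(unused):
            if abs(p - r) <= tolerance:
                hits += 1
                del unused[i]  # each reference boundary is matched at most once
                break
    return hits


def precision_recall(predicted, reference, tolerance=0.2):
    """Return (precision, recall) for predicted vs. reference boundaries."""
    hits = match_boundaries(predicted, reference, tolerance)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy values (hypothetical, not taken from the study).
reference = [2.0, 5.5, 9.0, 12.5, 15.0]
predicted = [2.1, 5.4, 7.0, 12.6]
precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of predicted boundaries that match a reference boundary, and recall is the share of reference boundaries that were found; a wider tolerance would raise both figures.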
Similar resources
Production of English Lexical Stress by Persian EFL Learners
This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...
Prosodic Cues and Answer Type Detection for the Deception Sub-Challenge
Deception is a deliberate act to deceive an interlocutor by transmitting a message containing false or misleading information. Detection of deception consists in the search for reliable differences between liars and truth-tellers. In this paper, we used the Deceptive Speech Database (DSD) provided for the Deception sub-challenge. DSD consists of deceptive and non-deceptive answers to a set of unkn...
Identification of vowel length, word stress, and compound words and phrases by postlingually deafened cochlear implant listeners.
BACKGROUND: The accurate perception of prosody assists a listener in deriving meaning from natural speech. Few studies have addressed the ability of cochlear implant (CI) listeners to perceive the brief-duration prosodic cues involved in contrastive vowel length, word stress, and compound word and phrase identification. PURPOSE: To compare performance in the perception of brief duration prosodi...
The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners
The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...
Modelling the Interplay of Multiple Cues in Prosodic Focus Marking
Focus marking is an important function of prosody in many languages. While many phonological accounts concentrate on fundamental frequency (F0), studies have established several additional cues to information structure. However, the relationship between these cues is rarely investigated. We simultaneously analyzed five prosodic cues to focus—F0 range, word duration, intensity, voice quality, th...