Spotting "hot spots" in meetings: human judgments and prosodic cues
Abstract
Recent interest in the automatic processing of meetings is motivated by a desire to summarize, browse, and retrieve important information from lengthy archives of spoken data. One of the most useful capabilities such a technology could provide is a way for users to locate “hot spots”, or regions in which participants are highly involved in the discussion (e.g., heated arguments or points of excitement). We ask two questions about hot spots in meetings in the ICSI Meeting Recorder corpus. First, we ask whether involvement can be judged reliably by human listeners. Results show that despite the subjective nature of the task, raters show significant agreement in distinguishing involved from non-involved utterances. Second, we ask whether there is a relationship between human judgments of involvement and automatically extracted prosodic features of the associated regions. Results show that there are significant differences in both F0 and energy between involved and non-involved utterances. These findings suggest that humans do agree to some extent on the judgment of hot spots, and that acoustic-only cues could be used for automatic detection of hot spots in natural meetings.
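As an illustration of the first question, the sketch below shows one common way to quantify agreement among several raters on a categorical judgment: Fleiss' kappa. The abstract does not say which agreement statistic the authors used, so this choice is an assumption, and the function name `fleiss_kappa` and the ratings in `toy_counts` are hypothetical, made up purely to show the calculation.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_cats = len(counts[0])

    # Observed agreement: for each item, the fraction of rater pairs that agree.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement, from the overall proportion of ratings per category.
    total = n_items * n_raters
    p_cats = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    p_expected = sum(p * p for p in p_cats)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: 6 utterances, 3 raters each.
# Columns are (involved, not involved); these are NOT values from the paper.
toy_counts = [[3, 0], [0, 3], [2, 1], [0, 3], [3, 0], [1, 2]]
kappa = fleiss_kappa(toy_counts)
print(f"Fleiss' kappa on toy data: {kappa:.3f}")
```

A value near 1 indicates agreement well above chance and a value near 0 indicates agreement no better than chance. Whether a given value counts as "significant" requires a separate test that this sketch does not perform.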
Similar resources
Audio Hot Spotting And Retrieval Using Multiple Features
This paper reports our on-going efforts to exploit multiple features derived from an audio stream using source material such as broadcast news, teleconferences, and meetings. These features are derived from algorithms including automatic speech recognition, automatic speech indexing, speaker identification, prosodic and audio feature extraction. We describe our research prototype – the Audio Ho...
The use of phrase-level prosodic information in lexical segmentation: evidence from word-spotting experiments in Korean.
This study investigated the role of phrase-level prosodic boundary information in word segmentation in Korean with two word-spotting experiments. In experiment 1, it was found that intonational cues alone helped listeners with lexical segmentation. Listeners paid more attention to local intonational cues (...H#L...) across the prosodic boundary than the intonational information within a prosodi...
Production of English Lexical Stress by Persian EFL Learners
This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...
Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing
We describe a “direct modeling” approach to using prosody in various speech technology tasks. The approach does not involve any hand-labeling or modeling of prosodic events such as pitch accents or boundary tones. Instead, prosodic features are extracted directly from the speech signal and from the output of an automatic speech recognizer. Machine learning techniques then determine a prosodic m...