Content-free Topic Segmentation with Acoustic Features (Report)
Abstract
In my previous work, content-free topic segmentation was approached with classification methods, using the vocalization as the unit of analysis [6]. That work emphasized speaker ID, vocalization start time, vocalization duration, pauses, overlaps and their corresponding Horizon features. It followed an approach to segmentation and classification introduced by Luz [2, 3] for analysing recordings of multidisciplinary medical meetings. In this study, I keep the previous experimental settings but focus on acoustic features, exploring their effect on topic segmentation/vocalization classification. Zero-crossing rate (ZCR) and root mean square (RMS) are well-studied features in audio analysis. In the following sections, I explain the method used to extract ZCR and RMS from WAV files and to integrate them into ARFF files.
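As a rough illustration of the extraction step mentioned above, the following sketch computes ZCR and RMS for one vocalization and writes the values as ARFF rows. It is not the procedure used in the report: the function names, the per-vocalization slicing by start time and duration, the restriction to 16-bit PCM, and the ARFF attribute names are all assumptions made for the example.

```python
import math
import struct
import wave


def read_mono_pcm16(path):
    """Read a 16-bit PCM WAV file; return (first-channel samples in [-1, 1), sample rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Samples are interleaved by channel; keep only the first channel.
    return [v / 32768.0 for v in values[::n_channels]], rate


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def root_mean_square(samples):
    """Square root of the mean squared amplitude."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def vocalization_features(samples, rate, start_s, duration_s):
    """ZCR and RMS over one vocalization, given its start and duration in seconds."""
    lo = int(start_s * rate)
    hi = int((start_s + duration_s) * rate)
    segment = samples[lo:hi]
    return zero_crossing_rate(segment), root_mean_square(segment)


def to_arff(rows, relation="vocalizations"):
    """Format (start, duration, zcr, rms) rows as ARFF text (attribute names are hypothetical)."""
    lines = [
        "@RELATION " + relation,
        "",
        "@ATTRIBUTE start NUMERIC",
        "@ATTRIBUTE duration NUMERIC",
        "@ATTRIBUTE zcr NUMERIC",
        "@ATTRIBUTE rms NUMERIC",
        "",
        "@DATA",
    ]
    for row in rows:
        lines.append(",".join("%.6f" % v for v in row))
    return "\n".join(lines) + "\n"
```

In use, `read_mono_pcm16` would supply the samples for a recording, `vocalization_features` would be called once per vocalization, and the resulting rows could be merged with the existing content-free attributes before being passed to `to_arff`.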
Similar resources
An Analysis of Content-free Dialogue Representation, Supervised Classification Methods and Evaluation Metrics for Meeting Topic Segmentation
Automatic topic segmentation in meeting recordings is intensively investigated due to the fact that topic is a salient discourse structure and it indicates natural reference points for contents. Unlike commonly used text-based topic segmentation methods, this thesis investigates content-free topic segmentation methods. Among the reasons for investigating such methods are: understanding the infl...
Discourse Segmentation of Multi-Party Conversation
We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded...
Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input
We address the task of unsupervised topic segmentation of speech data operating over raw acoustic information. In contrast to existing algorithms for topic segmentation of speech, our approach does not require input transcripts. Our method predicts topic changes by analyzing the distribution of reoccurring acoustic patterns in the speech signal corresponding to a single speaker. The algorithm r...
Automatic segmentation of speech based on hidden Markov models and acoustic features
An accurate database segmented and labeled at phonetic, subword or word level is very important for speech research. However, manual segmentation and labeling is a time consuming and error prone task. This paper describes an automatic procedure for the segmentation of speech in a set of acoustic sub-words units: given either the linguistic or the phonetic content of a speech utterance, the syst...
Combining Prosodic and Text Features for Segmentation of Mandarin Broadcast News
Automatic topic segmentation, separation of a discourse stream into its constituent stories or topics, is a necessary preprocessing step for applications such as information retrieval, anaphora resolution, and summarization. While significant progress has been made in this area for text sources and for English audio sources, little work has been done in automatic, acoustic feature-based segment...