Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech
نویسندگان
چکیده
Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech
منابع مشابه
Detecting Structural Metadata with Decision Trees and Transformation-Based Learning
The regular occurrence of disfluencies is a distinguishing characteristic of spontaneous speech. Detecting and removing such disfluencies can substantially improve the usefulness of spontaneous speech transcripts. This paper presents a system that detects various types of disfluencies and other structural information with cues obtained from lexical and prosodic information sources. Specifically...
متن کاملAutomatic detection of sentence boundaries and disfluencies based on recognized words
We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such events is crucial to facilitate speech understanding and other natural language processing tasks. Our approach is based on a combination of prosodic cues modeled by decision trees, and word-based even...
متن کاملFactors influencing ratios of filled pauses at clause boundaries in Japanese
Speech disfluencies have been studied as clues to human speech production mechanisms. Major constituents are assumed to be principal units of planning and disfluencies are claimed to occur when speakers have some trouble in planning such units. We tested two hypotheses about the probability of disfluencies by examining the ratios of filled pauses (fillers) at sentence and clause boundaries: 1) ...
متن کاملUm, one large pizza. A preliminary study of disfluency modelling for improving ASR
A corpus of spontaneous telephone transactions between call centre operators of a pizza company and its customers is examined for disfluencies (fillers and speech repairs) with the aim of improving automatic speech recognition. From this, a subset of the customer orders is selected as a test set. An architecture is presented which allows filled pauses and repairs to be detected and corrected. A...
متن کاملParsing Conversational Speech Using Enhanced Segmentation
The lack of sentence boundaries and presence of disfluencies pose difficulties for parsing conversational speech. This work investigates the effects of automatically detecting these phenomena on a probabilistic parser’s performance. We demonstrate that a state-of-the-art segmenter, relative to a pause-based segmenter, gives more than 45% of the possible error reduction in parser performance, an...
متن کامل