Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech
Abstract
We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialog acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialog act sequence. The dialog model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialog acts as observations emanating from the model states. Constraints on the likely sequence of dialog acts are modeled via a dialog act N-gram. The statistical dialog grammar is combined with word N-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialog act. We develop a probabilistic integration of speech recognition with dialog modeling to improve both speech recognition and dialog act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialog act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error.
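The combination of a dialog act N-gram with per-utterance evidence can be illustrated with a small Viterbi decoder. This is a minimal sketch, not the authors' implementation: it assumes a bigram dialog grammar, takes the dialog act labels as the hidden states of the HMM, and treats each utterance's evidence as a precomputed likelihood per dialog act. The function name `viterbi_dialog_acts` and all probabilities below are invented for illustration only.

```python
import math


def viterbi_dialog_acts(likelihoods, start, bigram):
    """Return the most probable dialog act sequence.

    likelihoods: list of dicts, one per utterance, mapping a dialog act
                 to P(evidence | act) (hypothetical values, e.g. from
                 word N-grams or a prosodic classifier).
    start:       dict mapping a dialog act to its initial probability.
    bigram:      dict of dicts, bigram[prev][nxt] = P(nxt | prev).
    """
    acts = list(start)
    # Work in the log domain to avoid underflow on long conversations.
    best = {a: math.log(start[a]) + math.log(likelihoods[0][a]) for a in acts}
    backpointers = []
    for obs in likelihoods[1:]:
        new_best, pointers = {}, {}
        for a in acts:
            # Best predecessor for act `a` under the dialog grammar.
            prev = max(acts, key=lambda p: best[p] + math.log(bigram[p][a]))
            new_best[a] = best[prev] + math.log(bigram[prev][a]) + math.log(obs[a])
            pointers[a] = prev
        backpointers.append(pointers)
        best = new_best
    # Trace back from the best final state.
    path = [max(acts, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy numbers, purely illustrative (not taken from the paper).
start = {"Statement": 0.6, "Question": 0.3, "Backchannel": 0.1}
bigram = {
    "Statement": {"Statement": 0.5, "Question": 0.2, "Backchannel": 0.3},
    "Question": {"Statement": 0.8, "Question": 0.1, "Backchannel": 0.1},
    "Backchannel": {"Statement": 0.6, "Question": 0.2, "Backchannel": 0.2},
}
likelihoods = [
    {"Statement": 0.2, "Question": 0.7, "Backchannel": 0.1},
    # The second utterance is ambiguous on its own evidence.
    {"Statement": 0.45, "Question": 0.45, "Backchannel": 0.1},
    {"Statement": 0.1, "Question": 0.1, "Backchannel": 0.8},
]

print(viterbi_dialog_acts(likelihoods, start, bigram))
```

In this toy example the second utterance is equally likely to be a Statement or a Question on its own evidence; the dialog grammar resolves it to a Statement because a Statement is the more probable follow-up to a Question.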
Similar resources
Identifying Discourse Markers in Spoken Dialog
In this paper, we present a method for identifying discourse marker usage in spontaneous speech based on machine learning. Discourse markers are denoted by special POS tags, and thus the process of POS tagging can be used to identify discourse markers. By incorporating POS tagging into language modeling, discourse markers can be identified during speech recognition, in which the timeliness of t...
Automatic Detection of Discourse Structure for Speech Recognition and Understanding
We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 ‘Dialog Acts’ (DAs), (question, answer, backchannel, agreement, disagreement, apology, etc). We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of human-to-human telephone conversations with these 42 types and ...
Direct Modeling of Prosody: An Overview of Applications in Automatic Speech Processing
We describe a “direct modeling” approach to using prosody in various speech technology tasks. The approach does not involve any hand-labeling or modeling of prosodic events such as pitch accents or boundary tones. Instead, prosodic features are extracted directly from the speech signal and from the output of an automatic speech recognizer. Machine learning techniques then determine a prosodic m...
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as STATEMENT, QUESTION, BACKCHANNEL, AGREEMENT, DISAGREEMENT, and APOLOGY. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating...
A Part-of-Speech Tagging System for the Persian Language
Abstract: Part-of-speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, and automatic speech recognition. Highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...
Journal title:
- Computational Linguistics
Volume 26
Publication date: 2000