The Role of Disfluencies in Topic Classification of Human-Human Conversations

Authors

  • Constantinos Boulis
  • Jeremy G. Kahn
  • Mari Ostendorf
Abstract

We investigate the impact of disfluencies on the task of classifying natural human-human conversations into topics. Disfluencies are distinctive to spoken language, and their effect on a number of spoken language understanding tasks, including spoken language classification, remains largely unknown. We use a subset of Switchboard-I annotated for disfluencies and topics, and investigate the effect of different disfluency categories with both true and automatically generated transcripts. We show that under the popular bag-of-words representation, even perfect disfluency filtering has a minimal impact on topic classification performance on hand-transcribed data. However, differences are larger with more complex representations (e.g., bigrams) and for some classifiers operating on recognizer transcripts.
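To make the contrast between bag-of-words and bigram representations concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a hypothetical filled-pause list (`FILLED_PAUSES`) and a crude rule that drops immediate word repetitions, whereas the paper relies on Switchboard-I disfluency annotations. The helper names `remove_disfluencies` and `ngram_counts` are illustrative only.

```python
from collections import Counter

# Hypothetical filled-pause list (an assumption for this sketch); the paper
# uses annotated disfluency categories rather than a fixed word list.
FILLED_PAUSES = {"uh", "um"}


def remove_disfluencies(tokens):
    """Drop filled pauses and immediate word repetitions.

    This is a crude stand-in for annotation-based disfluency filtering.
    """
    cleaned = []
    for tok in tokens:
        if tok in FILLED_PAUSES:
            continue  # skip filled pauses such as "uh" and "um"
        if cleaned and cleaned[-1] == tok:
            continue  # skip an exact repeat of the previous kept word
        cleaned.append(tok)
    return cleaned


def ngram_counts(tokens, n):
    """Count n-grams (as tuples) in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


utterance = "i uh i think the the weather is um nice".split()
cleaned = remove_disfluencies(utterance)

raw_bigrams = ngram_counts(utterance, 2)
clean_bigrams = ngram_counts(cleaned, 2)

# Content-word unigrams are the same before and after filtering, while
# several bigrams are broken up or created by the disfluent tokens.
print(cleaned)
print(sorted(set(raw_bigrams) - set(clean_bigrams)))
print(sorted(set(clean_bigrams) - set(raw_bigrams)))
```

In this toy utterance, filtering leaves every content-word unigram in place but changes which bigrams appear, which is one plausible reason n-gram features could be more sensitive to disfluencies than a bag of words. It illustrates the intuition only and does not reproduce the paper's experiments.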

Similar resources

Disfluencies and human language comprehension.

Spoken language contains disfluencies, which include editing terms such as uh and um as well as repeats and corrections. In less than ten years the question of how disfluencies are handled by the human sentence comprehension system has gone from virtually ignored to a topic of major interest in computational linguistics and psycholinguistics. We discuss relevant empirical findings and describe ...

Towards Improving the Naturalness of Social Conversations with Dialogue Systems

We describe an approach to improving the naturalness of a social dialogue system, Talkie, by adding disfluencies and other content-independent enhancements to synthesized conversations. We investigated whether listeners perceive conversations with these improvements as natural (i.e., human-like) as human-human conversations. We also assessed their ability to correctly identify these conversatio...

Role of Epigenetics in Biology and Human Diseases

For a long time, scientists have tried to describe disorders solely in terms of genetic or environmental factors. However, the role of epigenetics in human diseases has been considered for half a century. In the last decade, this subject has attracted much interest, especially in complicated disorders such as behavior plasticity, memory, cancer, autoimmune disease, and addiction as well as neurod...

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: In processing large data, scientists have to perform the tedious task of analyzing large volumes of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be unified to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...

Identifying Human Errors among Control Room Operators Using the HEIST Technique in an Oil Company

Background and aims: Considering the role of human errors in the incidence of catastrophic events in control rooms, and the lack of effectiveness of classical techniques for identifying human errors, special techniques are required for identification of human errors. Therefore, this study aimed to identify human errors in the control room of an oil company using the HEIST technique. Methods: Th...

Publication date: 2005