Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification

Authors

  • William Yang Wang
  • Fadi Biadsy
  • Andrew Rosenberg
  • Julia Hirschberg
Abstract

Traditional studies of speaker state focus primarily upon one-stage classification techniques using standard acoustic features. In this article, we investigate multiple novel features and approaches to two recent tasks in speaker state detection: level-of-interest (LOI) detection and intoxication detection. In the task of LOI prediction, we propose a novel Discriminative TFIDF feature to capture important lexical information and a novel Prosodic Event detection approach using AuToBI; we combine these with acoustic features for this task using a new multilevel multistream prediction feedback and similarity-based hierarchical fusion learning approach. Our experimental results outperform published results of all systems in the 2010 Interspeech Paralinguistic Challenge – Affect Subchallenge. In the intoxication detection task, we evaluate the performance of Prosodic Event-based, phone duration-based, phonotactic, and phonetic-spectral-based approaches, finding that a combination of the phonotactic and phonetic-spectral approaches achieves a significant improvement over the 2011 Interspeech Speaker State Challenge – Intoxication Subchallenge baseline. We discuss our results using these new features and approaches and their implications for future research. © 2012 Elsevier Ltd. All rights reserved.

Similar resources

Production of English Lexical Stress by Persian EFL Learners

This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...

Does it Groove or does it Stumble - Automatic Classification of Alcoholic Intoxication using Prosodic Features

This paper studies how prosodic features can help in the automatic detection of alcoholic intoxication. We compute features that have recently been proposed to model speech rhythm such as the pair-wise variability index for consonantal and vocalic segments (PVI) and study their aptness for the task. Further, we use a large prosodic feature vector modelling the usual candidates – pitch, intensit...

Detecting Intoxication in Speech

Researchers at Columbia are investigating ways to automatically detect intoxication in speech. William Yang Wang, currently a PhD student at Carnegie Mellon who worked on this team while a Master's student, discussed the project and its goals with us. OVERVIEW Imagine a world where DUIs (driving under the influence violations) never occurred. How can this happen? Traditionally devices like br...

Use of prosodic speech characteristics for automated detection of alcohol intoxication

In this paper we describe our methodology for automatic detection of speaker alcoholization. Our task is restricted to detection of considerable alcoholization (blood alcohol level ≥ 0.8 per mille), so that a two-class classification problem is to be solved. In particular, our attention is focused on the influence of alcohol intoxication on the prosodic aspects of spoken language. A ne...

Drink and Speak: On the Automatic Classification of Alcohol Intoxication by Acoustic, Prosodic and Text-Based Features

This paper focuses on the automatic detection of a person’s blood alcohol level based on automatic speech processing approaches. We compare 5 different feature types with different ways of modeling. Experiments are based on the ALC corpus of the IS2011 Speaker State Challenge. The classification task is restricted to the detection of a blood alcohol level above 0.5 ‰. Three feature sets are based o...

Journal:
  • Computer Speech & Language

Volume 27

Publication date: 2013