Acoustic Assessment of Disordered Voice with Continuous Speech Based on Utterance-Level ASR Posterior Features

Authors

  • Yuanyuan Liu
  • Tan Lee
  • Pak-Chung Ching
  • Thomas K. T. Law
  • Kathy Y. S. Lee
Abstract

Most previous studies on acoustic assessment of disordered voice focused on extracting perturbation features from isolated vowels produced with steady-state phonation. Natural speech, however, is considered preferable in terms of flexibility, effectiveness and reliability for clinical practice. This paper investigates the application of automatic speech recognition (ASR) technology to disordered voice assessment of Cantonese speakers. A DNN-based ASR system is trained using phonetically rich continuous utterances from normal speakers. Frame-level phone posteriors obtained from the ASR system are found to be strongly correlated with the severity level of voice disorder. Phone posteriors in utterances with severe disorder exhibit significantly larger variation than those in utterances with mild disorder. A set of utterance-level posterior features is computed to quantify this variation for pattern recognition. An SVM-based classifier is used to classify an input utterance into the categories of mild, moderate and severe disorder. The two-class classification accuracy for mild versus severe disorder is 90.3%, while significant confusion between mild and moderate disorders is observed. For some of the subjects with severe voice disorder, the classification results are highly inconsistent among individual utterances. Furthermore, short utterances tend to have more classification errors.
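The abstract does not list the exact utterance-level posterior features, so the following is only a minimal sketch of the general idea: summarizing how much the frame-level phone posteriors vary within one utterance. The function name `utterance_posterior_features`, the choice of per-frame entropy and top-posterior statistics, and the synthetic posteriors are all assumptions made for illustration, not the authors' feature set. In a full pipeline, vectors like these could be passed to a classifier such as an SVM, which is not shown here.

```python
import numpy as np


def utterance_posterior_features(posteriors: np.ndarray) -> np.ndarray:
    """Summarize frame-level phone posteriors of one utterance.

    posteriors: array of shape (n_frames, n_phones); each row sums to 1.
    Returns 4 illustrative features (an assumption, not the paper's set):
    mean and std of per-frame entropy, mean and std of the top posterior.
    """
    eps = 1e-12
    p = np.clip(posteriors, eps, 1.0)             # avoid log(0)
    frame_entropy = -(p * np.log(p)).sum(axis=1)  # uncertainty per frame
    top_posterior = posteriors.max(axis=1)        # confidence per frame
    return np.array([
        frame_entropy.mean(),
        frame_entropy.std(),
        top_posterior.mean(),
        top_posterior.std(),
    ])


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax, used only to build synthetic posteriors."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Synthetic stand-ins for ASR output: 200 frames, 40 hypothetical phone classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 40))
sharp = utterance_posterior_features(softmax(8.0 * logits))    # confident frames
diffuse = utterance_posterior_features(softmax(0.5 * logits))  # uncertain frames
```

On these synthetic inputs, the diffuse posteriors yield a higher mean entropy and a lower mean top posterior than the sharp ones, which is the kind of contrast such features are meant to capture.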

Related resources

A Study of Some Acoustic Features of Infant-Directed Speech in Persian-Speaking Mothers

Introduction: When adults talk to another person, they also take the linguistic characteristics of the listener into account. A clear example of speech that changes depending on the listener is maternal, or infant-directed, speech. Infant-directed speech is slower, with longer sentences and pauses at the end of the utterance. Undoubtedly the most distinctive feature of this style of speech is acoustic c...

Non-linear estimation of voice activ recognition of nois

Feed-forward multi-layer perceptrons (MLP) and recurrent neural networks (RNN) fed with different sets of acoustic features are proposed for computing the presence and absence of speech in a continuous speech signal in the presence of various levels of background noise. Detailed performance evaluations on voice activity detection (VAD) are reported using the Aurora2, Aurora3 and TIMIT corpora. It is ...

Combining Phonological and Acoustic ASR-Free Features for Pathological Speech Intelligibility Assessment

Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognizers (ASR) to compare the patient’s utterance with the target text. These methods are bound to one language and tend to be less accurate when speakers hesitate or mak...

The Effects of Size and Type of Vocal Fold Polyp on Some Acoustic Voice Parameters

Background: Vocal abuse and misuse can result in vocal fold polyps. Certain features determine the extent to which vocal fold polyps affect acoustic voice parameters. The present study aimed to define the effects of polyp size on acoustic voice parameters, and to compare these parameters in hemorrhagic and non-hemorrhagic polyps. Methods: In the present retrospective study, 28 individuals with hemorrha...

The effect of bilateral subthalamic nucleus deep brain stimulation (STN-DBS) on the acoustic and prosodic features in patients with Parkinson’s disease: A study protocol for the first trial on Iranian patients

Background: The effect of subthalamic nucleus deep brain stimulation (STN-DBS) on voice features in Parkinson’s disease (PD) is controversial. No study has comprehensively evaluated the voice features of PD patients who underwent STN-DBS using acoustic, perceptual, and patient-based assessments. Furthermore, no study has investigated prosodic features before and after DBS in PD. The curren...

Publication date: 2017