Informatized Caption Enhancement Based on IBM Watson API and Speaker Pronunciation Time-DB

Authors

Yong-Sik Choi

YunSik Son

Jin-Woo Jung

Abstract

This paper aims to reduce the inaccuracy of existing informatized captions in noisy environments by using additional caption information. The IBM Watson API can automatically generate an informatized caption, including timing information and speaker ID information, from voice input. However, when the voice signal contains noise, recognition quality degrades and the informatized caption contains errors. This is especially common in movies, where background music and special sound effects are present. To reduce caption errors, the original caption text and the voice information are supplied together, and the informatized caption that the IBM Watson API produces from the voice information is compared with the original text to automatically detect and correct the erroneous parts. In this process, each corrected word is converted into an informatized caption entry using a database of the average pronunciation time of each word for each speaker. In this way, more precise informatized captions can be generated on top of the IBM Watson API.
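The detect-and-correct step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the comparison is performed, so several things here are assumptions: the recognizer output is modeled as plain `(word, start, end, speaker)` tuples rather than the actual IBM Watson response, the alignment uses the standard library's `difflib.SequenceMatcher` as a stand-in, and the names `correct_caption`, `avg_time`, and `DEFAULT_DURATION` are hypothetical.

```python
from difflib import SequenceMatcher

DEFAULT_DURATION = 0.3  # assumed fallback (seconds) for words missing from the DB


def correct_caption(recognized, reference, avg_time):
    """Align recognizer output with the reference caption and repair mismatches.

    recognized: list of (word, start_sec, end_sec, speaker_id) tuples (assumed shape)
    reference:  list of words from the original caption text
    avg_time:   dict mapping (speaker_id, lowercase word) -> average seconds
    """
    rec_words = [word.lower() for word, _, _, _ in recognized]
    ref_words = [word.lower() for word in reference]
    matcher = SequenceMatcher(a=rec_words, b=ref_words, autojunk=False)

    corrected = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Recognition agrees with the reference: keep timing and speaker.
            for offset in range(i2 - i1):
                _, start, end, speaker = recognized[i1 + offset]
                corrected.append((reference[j1 + offset], start, end, speaker))
            continue
        if tag == "delete":
            # Words the recognizer produced that the reference lacks: drop them.
            continue

        # "replace" or "insert": work out the time window that must be filled.
        if i1 < i2:
            start, end = recognized[i1][1], recognized[i2 - 1][2]
            speaker = recognized[i1][3]
        else:
            prev = recognized[i1 - 1] if i1 > 0 else None
            nxt = recognized[i1] if i1 < len(recognized) else None
            start = prev[2] if prev else 0.0
            end = nxt[1] if nxt else None
            speaker = (prev or nxt or (None, 0.0, 0.0, 0))[3]

        durations = [avg_time.get((speaker, w), DEFAULT_DURATION)
                     for w in ref_words[j1:j2]]
        total = sum(durations)
        # Stretch or shrink the average durations to fit the window when known.
        fits = end is not None and end > start and total > 0
        scale = (end - start) / total if fits else 1.0

        cursor = start
        for word, duration in zip(reference[j1:j2], durations):
            corrected.append((word, cursor, cursor + duration * scale, speaker))
            cursor += duration * scale
    return corrected


if __name__ == "__main__":
    noisy = [("hello", 0.0, 0.5, 1), ("word", 0.5, 1.5, 1)]
    db = {(1, "big"): 0.2, (1, "world"): 0.6}
    for entry in correct_caption(noisy, ["hello", "big", "world"], db):
        print(entry)
```

Scaling the per-speaker average durations to the mismatched time window keeps the corrected words inside the span the recognizer reported, which is one plausible way to use such a database; the paper may distribute the timing differently.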

Similar resources

CAPT and its Effect on English Language Pronunciation Enhancement: Evidence from Bilinguals and Monolinguals

Nowadays there are several challenges for English teachers as well as researchers regarding how to teach foreign language pronunciation more effectively. The current study aimed to explore the effect of computer-assisted pronunciation teaching (CAPT) on the pronunciation of Persian monolinguals and of Turkmen-Persian and Baloch-Persian bilinguals, considering production and perception. A sample of...

Game-based Teaching of Stress Placement on Multi-syllabic English Words

Accurate pronunciation is an important component of language ability and the main outward linguistic sign of whether someone is a native speaker of a language or not. An area of particular difficulty for Persian-speaking learners of English, which may cause 'foreign accent' or misunderstanding in speaking, is placement of stress on multi-syllable words. Game-based pronunciation teaching can be ...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel speech enhancement of signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Articulatory feature-based conditional pronunciation modeling for speaker verification

Because of the differences in education background, accents, etc., different persons have their unique way of pronunciation. This paper exploits the pronunciation characteristics of speakers and proposes a new conditional pronunciation modeling (CPM) technique for speaker verification. The proposed technique aims to establish a link between articulatory properties (e.g., manners and places of a...

Improving pronunciation modeling for non-native speech recognition

In this paper, three different approaches to pronunciation modeling are investigated. Two existing pronunciation modeling approaches, namely the pronunciation dictionary and the n-best rescoring approach, are modified to work with a small amount of non-native speech. We also propose a speaker clustering approach, which is capable of grouping the speakers based on their pronunciation habits. Given some s...

Publication date: 2018
