A Corpus-based Approach to <ahem/> Expressive Speech Synthesis

Authors

E. Eide

J. Pitrelli

Abstract

Human speech communication can be thought of as comprising two channels – the words themselves, and the style in which they are spoken. Each of these channels carries information. Today's most-advanced text-to-speech (TTS) systems such as [1],[2],[3],[4] fall far short of human speech because they offer only a single, fixed style of delivery, independent of the message. In this paper, we describe the IBM Expressive TTS Engine, which is able to add another channel by offering five speaking styles. These are: neutral declarative, conveying good news, conveying bad news, asking a question, and showing contrastive emphasis. In addition to generating speech in these five styles, our TTS system is also able to generate paralinguistic events such as sighs, breaths, and filled pauses which further enrich the style channel. We describe our methods for generating and evaluating expressive synthetic speech and paralinguistic effects. We show significant perceptual differences between expressive and neutral synthetic speech for each of our speaking styles. In addition, we describe how users have been empowered to easily communicate the desired expression to the TTS engine through our extensions [5] of the Speech Synthesis Markup Language (SSML) [6].

Similar resources

Expressive Speech Synthesis for Czech Limited Domain Dialogue System – Basic Experiments

This paper describes a development of limited domain expressive speech synthesis for the Czech language. Our current speech synthesis system is based on unit selection methods and produces high quality speech in a neutral speaking style. This work focuses on modifications made in the synthesis algorithm to integrate expressivity into generated speech. There is also introduced a listening test, ...

Listening-Test-Based Annotation of Communicative Functions for Expressive Speech Synthesis

This paper focuses on the evaluation of a listening test that was carried out to objectively annotate expressive speech recordings and to further develop a limited domain expressive speech synthesis system. There are two main issues to face in this task. The first issue to consider is that expressivity in speech has to be defined in some way. The sec...

Towards synthesising expressive speech; designing and collecting expressive speech data

Corpus-based speech synthesis needs representative corpora of human speech if it is to meet the needs of everyday spoken interaction. This paper describes methods for recording such corpora, and details some difficulties (with their solutions) found in the use of spontaneous speech data for synthesis.

Automatic exploration of corpus-specific properties for expressive text-to-speech: a case study in emphasis

In this paper we explore an approach to expressive text-to-speech synthesis in which pre-existing expression-specific corpora are complemented with automatically generated labels to augment the search space of units the engine can exploit to increase its expressiveness. We motivate this data-discovery approach as an alternative to an approach guided by data collection, in order to harness the fu...

Formal expressive indiscernibility underlying a prosodic deformation model

We are here concerned by the setting up of a model and a formalism for expressive speech synthesis under the paradigm of a corpus-based approach. Our objective is to apply prosodic expressive forms, acquired from natural human-reading recordings, on a new textual matter. We outline a general model for speech expressiveness. Then we deal with some formal aspects of expressive representation. We ...

Publication date: 2004
