Singing Voice Analysis/Synthesis

نویسندگان

  • Youngmoo Edmund Kim
  • Andrew B. Lippman
چکیده

The singing voice is the oldest and most variable of musical instruments. By combining music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. As listeners, we are innately drawn to the sound of the human voice, and when present it is almost always the focal point of a musical piece. But the acoustic flexibility of the voice in intimating words, shaping phrases, and conveying emotion also makes it the most difficult instrument to model computationally. Moreover, while all voices are capable of producing the common sounds necessary for language understanding and communication, each voice possesses distinctive features independent of phonemes and words. These unique acoustic qualities are the result of a combination of innate physical factors and expressive characteristics of performance, reflecting an individual’s vocal identity. A great deal of prior research has focused on speech recognition and speaker identification, but relatively little work has been performed specifically on singing. There are significant differences between speech and singing in terms of both production and perception. Traditional computational models of speech have focused on the intelligibility of language, often sacrificing sound quality for model simplicity. Such models, however, are detrimental to the goal of singing, which relies on acoustic authenticity for the non-linguistic communication of expression and emotion. These differences between speech and singing dictate that a different and specialized representation is needed to capture the sound quality and musicality most valued in singing. This dissertation proposes an analysis/synthesis framework specifically for the singing voice that models the time-varying physical and expressive characteristics unique to an individual voice. The system operates by jointly estimating source-filter voice model parameters, representing vocal physiology, andmodeling the dynamic behavior of these features over time to represent aspects of expression. This framework is demonstrated to be useful for several applications, such as singing voice coding, automatic singer identification, and voice transformation. Thesis supervisor: Barry L. Vercoe, D.M.A. Title: Professor of Media Arts and Sciences

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice

To construct a natural singing-voice synthesis system, it is important to adequately control acoustic features such as fundamental frequency (F0), spectrum shapes, and phoneme duration in the synthesis method. This paper reveals acoustic features affecting singing-voice perception by comparative analyzing singingand speaking-voices, and then proposes a transforming method from speaking-voice in...

متن کامل

Control Methods of Acoustic Parameters for Singing-Voice Synthesis

To construct a natural singing voice synthesis system, it is important to adequately control acoustic parameters such as fundamental frequency (F0), phoneme duration, and spectrum information in the synthesis method, based on comparative analysis between spokenand singing-voices. This paper proposes a transformation from read speech into singing-voice using STRAIGHT. This method is composed of ...

متن کامل

A study on the pitch pattern of a singing voice synthesis system based on the cepstral method

We synthesize singing voice by rule based on cepstral method. Higher accuracy of analysis and synthesis is required to synthesize singing voice, comparing to rule-based speech synthesis. In this paper, we propose a method of analysis and synthesis with high accuracy. Also, we express pitch patterns minutely by curves that close to natural pitch by using this method. We apply Fujisaki model and ...

متن کامل

A Framework for Parametric Singing Voice Analysis/synthesis

The singing voice is the most variable and flexible of musical instruments. All voices are capable of producing the common phonemes necessary for language understanding and communication, yet each voice possesses distinctive qualities that are seemingly independent of phonemes and words. The unique acoustic qualities of an individual singer’s voice arise from a combination of innate physical fa...

متن کامل

Extraction of F0 Dynamic Characteristics and Development of F0 Control Model in Singing Voice

Fundamental frequency (F0) control models, which can cope with F0 dynamic characteristics related to singing-voice perception, are required to construct natural singing-voice synthesis systems. This paper discusses the importance of F0 dynamic characteristics in singing voices and demonstrates how much it influence on singing voice perception through psychoacoustic experiments. This paper, then...

متن کامل

HMM-based Mandarin Singing Voice Synthesis Using Tailored Synthesis Units and Question Sets

Fluency and continuity properties are essential in synthesizing a high quality singing voice. In order to synthesize a smooth and continuous singing voice, the Hidden Markov Model-based synthesis approach is employed in this study to construct a Mandarin singing voice synthesis system. The system is designed to generate Mandarin songs with arbitrary lyrics and melody in a certain pitch range. I...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003