Speech synthesis using damped sinusoids.
Abstract
A speech synthesizer was developed that operates by summing exponentially damped sinusoids at frequencies and amplitudes corresponding to peaks derived from the spectrum envelope of the speech signal. The spectrum analysis begins with the calculation of a smoothed Fourier spectrum. A masking threshold is then computed for each frame as the running average of spectral amplitudes over an 800-Hz window. In a rough simulation of lateral suppression, the running average is then subtracted from the smoothed spectrum (with negative spectral values set to zero), producing a masked spectrum. The signal is resynthesized by summing exponentially damped sinusoids at frequencies corresponding to peaks in the masked spectra. If a periodicity measure indicates that a given analysis frame is voiced, the damped sinusoids are pulsed at a rate corresponding to the measured fundamental period. For unvoiced speech, the damped sinusoids are pulsed on and off at random intervals. A perceptual evaluation of speech produced by the damped sinewave synthesizer showed excellent sentence intelligibility, excellent intelligibility for vowels in /hVd/ syllables, and fair intelligibility for consonants in CV nonsense syllables.
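The analysis and resynthesis steps described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the bin spacing, the relative peak floor (`rel_floor`), and the decay rate (`decay_per_s`) are all assumptions chosen for illustration, since the abstract does not give damping constants or peak-picking rules. The toy "smoothed spectrum" is synthetic, not speech data.

```python
import numpy as np

def masked_spectrum(smoothed, bin_hz, window_hz=800.0):
    """Subtract a running average (the masking threshold); zero out negatives."""
    width = max(1, int(round(window_hz / bin_hz)))
    kernel = np.ones(width) / width
    # Edge-pad so the running average is defined at both ends of the spectrum.
    left = width // 2
    right = width - 1 - left
    padded = np.pad(smoothed, (left, right), mode="edge")
    threshold = np.convolve(padded, kernel, mode="valid")
    return np.maximum(smoothed - threshold, 0.0)

def pick_peaks(masked, rel_floor=0.05):
    """Indices of local maxima above rel_floor * max (the floor is an assumption)."""
    mid = masked[1:-1]
    floor = rel_floor * masked.max()
    is_peak = (mid > masked[:-2]) & (mid >= masked[2:]) & (mid > floor)
    return np.flatnonzero(is_peak) + 1

def synthesize(freqs_hz, amps, onsets, fs, n_samples, decay_per_s=200.0):
    """Sum exponentially damped sinusoids, restarted at every pulse onset."""
    t = np.arange(n_samples) / fs
    response = np.zeros(n_samples)
    for f, a in zip(freqs_hz, amps):
        # decay_per_s is a hypothetical damping constant, not taken from the paper.
        response += a * np.exp(-decay_per_s * t) * np.sin(2 * np.pi * f * t)
    out = np.zeros(n_samples)
    for start in onsets:
        out[start:] += response[: n_samples - start]
    return out

# Toy smoothed spectrum: a flat floor plus two bumps at 500 Hz and 1500 Hz.
bin_hz = 10.0
freq_axis = np.arange(0.0, 4000.0 + bin_hz, bin_hz)
smoothed = (0.1
            + 1.0 * np.exp(-0.5 * ((freq_axis - 500.0) / 100.0) ** 2)
            + 0.5 * np.exp(-0.5 * ((freq_axis - 1500.0) / 100.0) ** 2))

masked = masked_spectrum(smoothed, bin_hz)
peaks = pick_peaks(masked)
peak_freqs = freq_axis[peaks]
peak_amps = masked[peaks]

fs, n_samples = 8000, 800
# Voiced frame: pulse once per fundamental period (here 80 samples, i.e. 100 Hz).
voiced = synthesize(peak_freqs, peak_amps, np.arange(0, n_samples, 80), fs, n_samples)
# Unvoiced frame: pulse at random onsets instead.
rng = np.random.default_rng(0)
random_onsets = np.sort(rng.choice(n_samples, size=10, replace=False))
unvoiced = synthesize(peak_freqs, peak_amps, random_onsets, fs, n_samples)
```

In this sketch the voiced and unvoiced cases differ only in the list of pulse onsets passed to `synthesize`, which mirrors the abstract's description of pulsing at the fundamental period versus at random intervals.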
Related articles
Goodwin & Vetterli : Matching Pursuit and Atomic Signal Models
The matching pursuit algorithm can be used to derive signal decompositions in terms of the elements of a dictionary of time-frequency atoms. Using a structured overcomplete dictionary yields a signal model that is both parametric and signal-adaptive. In this paper, we apply matching pursuit to the derivation of signal expansions based on damped sinusoids. It is shown that expansions in terms of...
Adaptive Modeling of Synthetic Nonstationary Sinusoids
Nonstationary oscillations are ubiquitous in music and speech, ranging from the fast transients in the attack of musical instruments and consonants to amplitude and frequency modulations in expressive variations present in vibrato and prosodic contours. Modeling nonstationary oscillations with sinusoids remains one of the most challenging problems in signal processing because the fit also depen...
An investigation of the application of dynamic sinusoidal models to statistical parametric speech synthesis
This paper applies a dynamic sinusoidal synthesis model to statistical parametric speech synthesis (HTS). For this, we utilise regularised cepstral coefficients to represent both the static amplitude and dynamic slope of selected sinusoids for statistical modelling. During synthesis, a dynamic sinusoidal model is used to reconstruct speech. A preference test is conducted to compare the selectio...
Perceptual audio modeling with exponentially damped sinusoids
This paper presents the derivation of a new perceptual model that represents speech and audio signals by a sum of exponentially damped sinusoids. Compared to a traditional sinusoidal model, the exponential sinusoidal model (ESM) is better suited to model transient segments that are readily found in audio signals. Total least squares (TLS) algorithms are applied for the automatic extraction of t...
Using Resonant Filters for the Synthesis of Time-Varying Sinusoids
This paper discusses sinusoidal synthesis by means of resonant filters. Resonant recursive filters have long been used to synthesize exponentially damped sinusoids but surprisingly little has been written about stability issues when the filter coefficients are allowed to vary and interpolation problems. In this paper, we discuss some of the issues one faces when synthesizing sinusoids with time-varying...
Journal:
- Journal of speech, language, and hearing research : JSLHR
Volume 45, Issue 4
Pages: -
Publication date: 2002