Analysis of Lombard speech using excitation source information
Authors
Abstract
This paper examines the Lombard effect on the excitation features in speech production. These features correspond mostly to the acoustic features at the subsegmental (shorter than one pitch period) level. Three features represent the excitation at this level: the instantaneous fundamental frequency F0 (i.e., pitch), the strength of excitation at the instants of significant excitation (epochs), and a loudness measure reflecting the sharpness of the impulse-like excitation around epochs. The Lombard effect influences both the pitch and the loudness. The extent of the Lombard effect on speech depends on the nature and level (or intensity) of the external feedback that causes it.
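As an illustration of the first feature, the instantaneous F0 is commonly taken as the reciprocal of the interval between successive epochs. The following is a minimal sketch under that assumption; the abstract does not say how epochs are extracted, so the epoch instants are taken as given, and the function name and sample values are hypothetical.

```python
def instantaneous_f0(epochs_s):
    """Return (time, F0 in Hz) pairs from epoch instants given in seconds.

    Assumption: epochs_s is a strictly increasing list of epoch instants
    already extracted by some method not specified here.
    """
    pairs = []
    for prev, curr in zip(epochs_s, epochs_s[1:]):
        period = curr - prev  # interval between successive epochs
        if period <= 0:
            raise ValueError("epoch instants must be strictly increasing")
        # F0 is assigned to the later epoch of each pair
        pairs.append((curr, 1.0 / period))
    return pairs


# Hypothetical epoch instants: 8 ms spacing, then 5 ms spacing.
epochs = [0.000, 0.008, 0.016, 0.021, 0.026]
f0_track = instantaneous_f0(epochs)
```

With these made-up instants, the F0 estimate rises as the epoch spacing shrinks, which is the kind of pitch increase the abstract attributes to the Lombard effect.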
Similar resources
Analysis of Lombard Effect Speech and Its Application in Speaker Verification for Imposter Detection
Speaking in the presence of noise changes the characteristics of the speech produced, which is known as the Lombard effect. This effect is perceived as an increase in the intensity of speaking. These changes in the characteristics of speech production serve to ensure intelligible communication in a noisy environment. These changes also result in the performance degradation of speech systems ...
Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs
Speaking style conversion is the technology of converting natural speech signals from one style to another. In this study, we focus on normal-to-Lombard conversion. This can be used, for example, to enhance the intelligibility of speech in noisy environments. We propose a parametric approach that uses a vocoder to extract speech features. These features are mapped using Bayesian GMMs from utter...
Discriminating Neutral and Emotional Speech using Neural Networks
In this paper, we address the issue of speaker-specific emotion detection (neutral vs emotion) from speech signals with models for neutral speech as reference. As emotional speech is produced by the human speech production mechanism, the emotion information is expected to lie in the features of both excitation source and the vocal tract system. Linear Prediction residual is used as the excitati...
Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort
This paper studies a deep neural network (DNN) based voice source modelling method in the synthesis of speech with varying vocal effort. The new trainable voice source model learns a mapping between the acoustic features and the time-domain pitch-synchronous glottal flow waveform using a DNN. The voice source model is trained with various speech material from breathy, normal, and Lombard speech...
Extraction of Excitation Information from Speech and Its Applications for Expressive Speech Processing
The speech production mechanism also produces speech with different voice qualities, such as phonations, emotions, expressive singing and other paralinguistic sounds. These sounds owe their characteristics mostly to the excitation component (vibration of the vocal folds at the glottis), whereas the dynamic vocal tract system primarily conveys the message. Hence, the excitati...