Speech Synthesis with Neural Networks
Abstract
Text-to-speech conversion has traditionally been performed either by concatenating short samples of speech or by using rule-based systems to convert a phonetic representation of speech into an acoustic representation, which is then converted into speech. This paper describes a system that uses a time-delay neural network (TDNN) to perform this phonetic-to-acoustic mapping, with another neural network to control the timing of the generated speech. The neural network system requires less memory than a concatenation system, and performed well in tests comparing it to commercial systems using other technologies.
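As a rough illustration of the time-delay idea mentioned in the abstract, the sketch below applies a single time-delay layer to a sequence of input frames: each output frame is computed from a short sliding window of consecutive input frames. The function name `tdnn_layer`, the toy feature vectors, the window length of three frames, the weights, and the tanh activation are all assumptions chosen for illustration. The abstract does not specify the paper's actual architecture, layer sizes, or training procedure, and the separate timing network is not modeled here.

```python
import math


def tdnn_layer(frames, weights, bias):
    """Apply one time-delay layer (hypothetical helper, not from the paper).

    frames:  list of T input vectors, each of length D_in
    weights: list of D_out units; each unit holds K delay taps,
             and each tap is a list of D_in weights
    bias:    list of D_out biases
    Returns T - K + 1 output vectors of length D_out (no padding).
    """
    n_delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        # The window holds the K consecutive frames seen by this output frame.
        window = frames[t:t + n_delays]
        out = []
        for unit_w, b in zip(weights, bias):
            total = b
            for tap_w, frame in zip(unit_w, window):
                total += sum(w * x for w, x in zip(tap_w, frame))
            out.append(math.tanh(total))  # assumed activation
        outputs.append(out)
    return outputs


# Assumed toy input: 5 frames with 2 phonetic features each.
phonetic_frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
# One output unit looking at a 3-frame window (illustrative weights).
weights = [[[0.5, -0.5], [0.5, -0.5], [0.5, -0.5]]]
bias = [0.0]

acoustic_frames = tdnn_layer(phonetic_frames, weights, bias)
print(len(acoustic_frames))  # 5 frames with a 3-frame window give 3 outputs
```

Because the same weights are reused at every time step, the layer's parameter count depends on the window length and feature sizes, not on the length of the utterance.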
Similar resources
Prediction of Gain in LD-CELP Using Hybrid Genetic/PSO-Neural Models
In this paper, the gain in the LD-CELP speech coding algorithm is predicted using three neural models, which are equipped with genetic and particle swarm optimization (PSO) algorithms to optimize the structure and parameters of the neural networks. Elman, multi-layer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models. The optimized number of nodes in the first and second hidden layers of El...
Speech synthesis using warped linear prediction and neural networks
A text-to-speech synthesis technique, based on warped linear prediction (WLP) and neural networks, is presented for high-quality individual sounding synthetic speech. Warped linear prediction is used as a speech production model with wide audio bandwidth yet with highly compressed control parameter data. An excitation codebook, inverse filtered from a target speaker’s voice, is applied to obtai...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Synthesis by Artificial Neural Networks
This research was accomplished during my studies at University of Sussex under supervision of Dr Si Wu. The whole project was focused on speech synthesis by artificial neural networks. The objective of the whole research is to implement an improvement of the speech synthesis process. A different approach on some speech synthesis procedures was introduced by neural networks. That approach was in...
Application of Neural Networks for POS Tagging and Intonation Control in Speech Synthesis for Polish
The paper describes use of neural networks in POS (part-of-speech) tagging and intonation control, needed in a speech synthesis system for the Polish language. Feedforward multilayered perceptrons have been proposed for both purposes. Considerations during planning the network architecture, used training data, training process and verification of the results are described.
Journal title:
- CoRR
Volume cs.NE/9811031
Pages -
Publication date 1996