Computational Modelling of Speech Production: English Rhythm
Mark Tatham and Katherine Morton
Abstract
In this paper we examine the treatment of English rhythm from both a theoretical and an experimental perspective. There are major shortcomings in the way not just rhythm but also prosodics in general is modelled; this is all too clear in various applications of the theory, particularly in computational areas such as speech synthesis (Keller and Keller 2002). Our objective is to begin characterising rhythm computationally within a hierarchically based model in which prosody is the framework for speech production in general. Such a model tightly integrates suprasegmental and segmental properties of speech in a well-defined and well-motivated binding process. Since the overall framework is the prosody, the phonetic rendering of utterances depends on their prosodic structure; this includes the detail of the structure of the syllable – a unit of the prosody. The idea is not novel [see Firth (1948) on the prosodic structure, and Kahn (1976) and Gussenhoven (1986) on the structure of syllables, in particular the phenomenon of ambisyllabicity and detailed phonetic rendering], and it formed the basis of the conceptual design of our SPRUCE computational model (Lewis and Tatham 1991). The prosodic framework approach within SPRUCE is expressed in such a way that (a) it incorporates hooks for phonetic rendering with expressive content, and (b) it is transferable to a speech synthesis system for the kind of model testing associated with utterance synthesis. SPRUCE itself is not a piece of 'speech technology'; rather, it is a speech production model which, because of its computational nature, lends itself to being the basis of a speech synthesiser proper.