Using articulatory position data in voice transformation
نویسندگان
چکیده
Articulatory position data is information about the location of various articulators in the vocal tract. One form of it has been made freely available in the MOCHA database [1]. This data is interesting in that it provides direct information on the production of speech, but there is the question of whether it actually provides information beyond what can be derived from the audio signal, which is much easier to collect. Although there has been some success in improving small-scale speech recognition and in demonstrating mappings between articulatory positions and spectral features of the audio signal, there are many problems to which this data has not been applied. This work investigates the possibility of using articulatory position data to improve voice transformation, which is the process of making speech from one person sound as if it had been spoken by another. After further investigation, it appears to be difficult to use articulatory position data to improve voice transformation using state-of-the-art voice transformation techniques as we only had a few positive results across a range of experiments. To achieve these results, it was necessary to modify our baseline voice transformation approach and/or consider features derived from the articulatory positions.
منابع مشابه
Using Articulatory Position Data to Improve Voice Transformation
Voice transformation (also known as voice conversion or voice morphing) is a name given to techniques which take speech from one speaker as input and attempt to produce speech that sounds like it came from another speaker. One compelling argument for good voice transformation is that it reduces the difficulty in creating additional synthetic voices with new identities and styles once an existin...
متن کاملDirect Speech Generation for a Silent Speech Interface based on Permanent Magnet Articulography
Patients with larynx cancer often lose their voice following total laryngectomy. Current methods for post-laryngectomy voice restoration are all unsatisfactory due to different reasons: requires frequent replacement due to biofilm growth (tracheo-oesoephageal valve), speech sounds gruff and masculine (oesophageal speech) or robotic (electro-larynx) and, in general, are difficult to master (oeso...
متن کاملVoice mimic system using an articulatory codebook for estimation of vocal tract shape
Voice mimic systems using articulatory codebooks require an initial estimate of the vocal tract shape in the vicinity of the global optimum. For this purpose, we need to gather a large set of corresponding articulatory and acoustic data in the articulatory codebook. Thus, searching and accessing the codebook becomes a di cult task. In this paper, the design of an articulatory codebook is presen...
متن کاملEvaluation of a Silent Speech Interface Based on Magnetic Sensing and Deep Learning for a Phonetically Rich Vocabulary
To help people who have lost their voice following total laryngectomy, we present a speech restoration system that produces audible speech from articulator movement. The speech articulators are monitored by sensing changes in magnetic field caused by movements of small magnets attached to the lips and tongue. Then, articulator movement is mapped to a sequence of speech parameter vectors using a...
متن کاملArticulatory-Spectral-Temporal Relations in Cantonese Vowels
This paper investigates the articulatory-spectral-temporal relations in the long [iː uː aː] and mediumlong [i u a] point vowels in Cantonese, through analyses of (i) the tongue positions in the oral cavity by means of the Electromagnetic Midsagittal Articulography (EMMA AG500) and (ii) the corresponding formant frequencies for the vowels using the speech analysis software, Computerized Speech L...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007