This is a placeholder. Final title will be filled later
Authors
Abstract
The Synface project is developing a synthetic talking face to aid hearing-impaired people in telephone conversations. This report investigates the gain in intelligibility from the synthetic talking head when it is controlled by hand-annotated speech. Audio from Swedish, English and Dutch sentences was degraded to simulate the information losses that arise in severe-to-profound hearing impairment. Twelve normal-hearing native speakers of each language took part. Auditory signals were presented alone, with the synthetic face, and with a video of the original talker. Purely auditory intelligibility was low. With the addition of the synthetic face, average intelligibility increased by 20%. Scores with the synthetic face were significantly lower than with the natural face for English and Dutch, but not for Swedish. Visual identification of English consonants showed that the synthetic face fell short of a natural face on both place and manner of articulation. This information will be used to improve the synthesis.
Similar resources
This is a placeholder. Final title will be filled later
This paper presents a model to predict the phrase commands of the Fujisaki model of the F0 contour for the Portuguese language. The location of phrase commands in the text is governed by a set of weighted rules. The amplitude (Ap) and timing (T0) of the phrase commands are predicted by separate neural networks. The features for both neural networks are discussed. Finally, a comparison between target and predi...
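For context on the quantities Ap and T0 mentioned above: in the standard Fujisaki model, each phrase command contributes to the log-F0 contour via the impulse response of a critically damped second-order filter. A minimal sketch of that phrase component (the time constant `alpha` here is an illustrative value, not taken from the paper, and accent commands are omitted):

```python
import math

def phrase_component(t, Ap, T0, alpha=2.0):
    """Contribution of one phrase command (amplitude Ap, timing T0) to ln F0.

    In the Fujisaki model the phrase component is the impulse response of a
    critically damped second-order filter: Gp(t) = alpha^2 * t * exp(-alpha*t)
    for t >= 0.  alpha = 2.0 rad/s is an illustrative value, not from the paper.
    """
    dt = t - T0
    if dt < 0:
        return 0.0
    return Ap * alpha ** 2 * dt * math.exp(-alpha * dt)

def log_f0(t, Fb, phrase_commands):
    """ln F0(t) = ln Fb + sum of phrase components (accent commands omitted)."""
    return math.log(Fb) + sum(phrase_component(t, Ap, T0)
                              for Ap, T0 in phrase_commands)
```

Predicting the pairs (Ap, T0) that drive this superposition is exactly the task the paper assigns to its two neural networks.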
Full text
This is a placeholder. Final title will be filled later
Solutions proposed in the literature for allocating multiple users to the sub-bands of an OFDM system with multiple antennas require high computational effort and consider delay-insensitive applications. Our approach overcomes these limitations by relaxing some hypotheses in order to give a feasible solution. The proposed algorithm can be applied to a real multiple-antenna OFDM sy...
Full text
This is a placeholder. Final title will be filled later
Classification performance for the emotional user states found in the few realistic, spontaneous databases available is not yet very high. We present a database of emotional children's speech in a human-robot scenario. Baseline classification performance is 44.5% for seven classes and 59.2% for four classes. We discuss possible strategies for tuning, e.g., using only prototypes (based on annotati...
Full text
This is a placeholder. Final title will be filled later
We report work on mapping the acoustic speech signal, parametrized using Mel-frequency cepstral analysis, onto electromagnetic articulography trajectories from the MOCHA database. We employ the machine-learning technique of support vector regression, in contrast to previous work that applied neural networks to the same task. Our results are comparable to those of the older attempts, even though, due to ...
Full text
This is a placeholder. Final title will be filled later
Recent auditory physiological evidence points to a modulation frequency dimension in the auditory cortex. This dimension exists jointly with the tonotopic acoustic frequency dimension. Thus, audition can be considered as a relatively slowly-varying two-dimensional representation, the “modulation spectrum,” where the first dimension is the well-known acoustic frequency and the second dimension i...
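The two-dimensional representation described above can be sketched numerically: compute a short-time spectrogram (the acoustic-frequency axis), then Fourier-transform each frequency band's magnitude envelope over time (the modulation-frequency axis). A minimal NumPy sketch, where the frame and hop sizes are illustrative choices rather than values from the paper:

```python
import numpy as np

def modulation_spectrum(x, fs, frame_len=256, hop=64):
    """Joint acoustic/modulation-frequency representation of signal x.

    Axis 0 of the result is acoustic frequency (bins of a short-time
    spectrogram); axis 1 is modulation frequency (FFT over time of each
    band's magnitude envelope).  Frame and hop sizes are illustrative
    choices, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))         # (frames, acoustic bins)
    env = spec.T - spec.T.mean(axis=1, keepdims=True)  # remove DC per band
    mod = np.abs(np.fft.rfft(env, axis=1))             # (acoustic, modulation)
    mod_freqs = np.fft.rfftfreq(n_frames, d=hop / fs)  # modulation-freq axis
    return mod, mod_freqs
```

For example, a 1 kHz tone amplitude-modulated at 4 Hz shows its modulation energy concentrated near 4 Hz in the acoustic band containing 1 kHz.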
Full text
This is a placeholder. Final title will be filled later
Sine-wave speech (SWS) is a three-tone replica of speech, conventionally created by matching each constituent sinusoid in amplitude and frequency with the corresponding vocal tract resonance (formant). We propose an alternative technique where we take a high-quality multicomponent sinusoidal representation and decimate this model so that there are only three components per frame. In contrast to...
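The conventional construction the abstract contrasts with can be sketched as a bank of three amplitude- and frequency-modulated sinusoids. Below is a minimal synthesis sketch; the per-frame (frequency, amplitude) tracks that drive it are assumed to come from an external formant tracker, and the frame duration is an illustrative choice:

```python
import numpy as np

def synthesize_sws(tracks, fs, frame_dur=0.01):
    """Synthesize a sine-wave speech replica from per-frame sinusoid tracks.

    tracks: array of shape (n_frames, n_components, 2) holding
            (frequency_Hz, amplitude) per component and frame; in the
            conventional technique n_components == 3, one per formant.
    Frequencies and amplitudes are interpolated to sample rate, and each
    sinusoid's phase is accumulated so it stays continuous across frames.
    """
    tracks = np.asarray(tracks, dtype=float)
    n_frames, n_comp, _ = tracks.shape
    spf = int(frame_dur * fs)              # samples per frame
    n_samples = n_frames * spf
    frame_t = np.arange(n_frames) * spf    # frame positions in samples
    t = np.arange(n_samples)
    out = np.zeros(n_samples)
    for c in range(n_comp):
        freq = np.interp(t, frame_t, tracks[:, c, 0])
        amp = np.interp(t, frame_t, tracks[:, c, 1])
        phase = 2.0 * np.pi * np.cumsum(freq) / fs   # accumulated phase
        out += amp * np.sin(phase)
    return out
```

The decimation approach the abstract proposes would instead select three components per frame from a richer sinusoidal model, then feed the surviving tracks to the same kind of oscillator bank.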
Full text