Shared resources for robust speech-to-text technology
Abstract
This paper describes ongoing efforts at the Linguistic Data Consortium to create shared resources for improved speech-to-text technology. Under the DARPA EARS program, technology providers are charged with creating STT systems whose outputs are substantially richer and much more accurate than is currently possible. These aggressive program goals motivate new approaches to corpus creation and distribution. EARS participants require multilingual broadcast and telephone speech data, transcripts and annotations at a much higher volume than any previous program. While standard approaches to resource collection and creation are prohibitively expensive for this volume of material, new methods established within EARS allow for the development of vast quantities of audio, transcripts and annotations. New distribution methods also provide for efficient deployment of needed resources to participating research sites and enable eventual publication to a wider community of language researchers.
Similar resources
Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text
This paper describes ongoing efforts at Linguistic Data Consortium to create shared evaluation resources for improved speech-to-text technology. The DARPA EARS Program (Effective, Affordable, Reusable Speech-to-Text) is focused on enabling core STT technology to produce rich, highly accurate output in a range of languages and speaking styles. The aggressive EARS program goals motivate new appro...
L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors
This study was intended first to categorize L2 learners in terms of their learning style preferences and second to investigate whether their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and the text-related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Tools and resources for Romanian text-to-speech and speech-to-text applications
In this paper we introduce a set of resources and tools aimed at providing support for natural language processing, text-to-speech synthesis and speech recognition for Romanian. While the tools are general purpose and can be used for any language (we successfully trained our system for more than 50 languages and participated in the Universal Dependencies Shared Task), the resources are only rel...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...