Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text
Abstract
This paper describes ongoing efforts at the Linguistic Data Consortium (LDC) to create shared evaluation resources for improved speech-to-text (STT) technology. The DARPA EARS Program (Effective, Affordable, Reusable Speech-to-Text) is focused on enabling core STT technology to produce rich, highly accurate output in a range of languages and speaking styles. The aggressive EARS program goals motivate new approaches to corpus creation and distribution. EARS research sites require multilingual broadcast news and telephone speech, transcripts and annotations at a much higher volume than any previous technology program. In response to these demands, LDC has developed new corpora for training and evaluating speech-to-text systems in English, Arabic and Chinese, and for supporting systems that distinguish speakers, identify and repair disfluencies, and punctuate text to improve readability.
Similar resources
NIST Rich Transcription 2002 Evaluation: A Preview
The National Institute of Standards and Technology (NIST) has been implementing evaluations of automatic speech transcription technologies for over 15 years. NIST has helped guide progress in these technologies by: creating increasingly challenging and realistic tests, helping to provide associated linguistic resources, employing uniform metrics and analyses across systems to assess performance...
Constructing a Reusable Linguistic Resource for a Polyglot Speech Synthesis
This paper is about constructing sharable linguistic information to be used across languages for a Text-to-Speech (TTS) system. The data is obtained from existing resources. The focus of the paper is the phonetic and linguistic aspects. A monolingual TTS architecture is introduced with descriptions of each stage of processing. A multilingual TTS architecture is also introduced. Language depende...
Infrastructure for Collaborative Annotation of Speech
Vast amounts of digital language data (primary data) and increasingly complex linguistic annotations (secondary data) are being created around the world with accelerating speed. There is a real risk of losing much of this data unless the compilers of language resources (primary and secondary data) and creators of tools start to pay more attention to the reusability of the resources and the inte...
Querying and Updating Treebanks: A Critical Survey and Requirements Analysis
Language technology makes extensive use of hierarchically annotated text and speech data. These databases are stored in flat files and manipulated using corpus-specific query tools or scripts. While the size of these databases and the range of applications have grown rapidly in recent years, neither method for managing the data has led to reusable, scalable software. The formal properties of the ...
Linguistic Resources for Speech Parsing
We report on the success of a two-pass approach to annotating metadata, speech effects and syntactic structure in English conversational speech: separately annotating transcribed speech for structural metadata, or structural events (fillers, speech repairs (or edit dysfluencies) and SUs, or syntactic/semantic units), and for syntactic structure (treebanking constituent structure and shallow ar...