Domain Specific Speech Acts for Spoken Language Translation
Authors
Abstract
We describe a coding scheme for machine translation of spoken task-oriented dialogue. The coding scheme covers two levels of speaker intention: domain-independent speech acts and domain-dependent domain actions. Our database contains over 14,000 tagged sentences in English, Italian, and German. We argue that domain actions, and not speech acts, are the relevant discourse unit for improving translation quality. We also show that, although domain actions are domain specific, the approach scales up to large domains without an explosion of domain actions and can be coded with high inter-coder reliability across research sites. Furthermore, although the number of domain actions is on the order of ten times the number of speech acts, sparseness is not a problem for the training of classifiers for identifying the domain action. We describe our work on developing high-accuracy speech act and domain action classifiers, which form the core of the source language analysis module of our NESPOLE machine translation system.
Similar resources
Enriching Spoken Language Translation with Dialog Acts
Current statistical speech translation approaches predominantly rely on just text transcripts and do not adequately utilize the rich contextual information such as conveyed through prosody and discourse function. In this paper, we explore the role of context characterized through dialog acts (DAs) in statistical translation. We demonstrate the integration of the dialog acts in a phrase-based st...
An interlingua based on domain actions for machine translation of task-oriented dialogues
This paper describes an interlingua for spoken language translation that is based on domain actions in the travel planning domain. Domain actions are composed of speech acts (e.g., request-information), attributes (e.g., size, price), and objects (e.g., hotel, flight) and can take arguments. Development of the interlingua is guided by a database containing travel dialogues in English, Korean, Ja...
Rapid Portability among Domains in an Interactive Spoken Language Translation System
Spoken Language Translation systems have usually been produced for such specific domains as health care or military use. Ideally, such systems would be easily portable to other domains in which translation is mission critical, such as emergency response or law enforcement. However, porting has in practice proven difficult. This paper will comment on the sources of this difficulty and briefly pr...
Pseudo-morpheme and Confusion Network Based Korean-english Statistical Spoken Language Translation System
In this demonstration, we present POSSLT (POSTECH Spoken Language Translation) for a Korean-English statistical spoken language translation (SLT) system using pseudo-morpheme and confusion network (CN) based technique. Like most other SLT systems, automatic speech recognition (ASR) and machine translation (MT) are coupled in a cascading manner in our SLT system. We used confusion network based ...
JANUS: a Multi-lingual Speech-to-speech Translation System for Spontaneously Spoken Language in a Limited Domain
Janus is a multilingual speech translation system currently operating in the domain of meeting scheduling. Translating spontaneous speech requires a high degree of robustness to overcome the disfluencies of spoken language as well as errors in speech recognition. In this system description, we focus on the robust speech translation components in Janus: the skipping GLR* parser, the segmentation o...