Dependency Tree Translation: Syntactically Informed Phrasal SMT
نویسندگان
چکیده
We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. We depend on a source-language dependency parser and a word-aligned parallel corpus. The only target language resource assumed is a word breaker. These are used to produce treelet (“phrase”) translation pairs as well as several models, including a channel model, an order model, and a target language model. Together these models and the treelet translation pairs provide a powerful and promising approach toMT that incorporates the power of phrasal SMT with the linguistic generality available in a parser. We evaluate two decoding approaches, one inspired by dynamic programming and the other employing an A* search, comparing the results under a variety of settings.
منابع مشابه
Dependency Treelet Translation: Syntactically Informed Phrasal SMT
We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. This method requires a source-language dependency parser, target language word segmentation and an unsupervised word alignment component. We align a parallel corpus, project the source dependency parse onto the target sentence, e...
متن کاملMicrosoft Research Treelet Translation System: Meeting Of The North American Association For Computational Linguistics 2006 Europarl Evaluation
The Microsoft Research translation system is a syntactically informed phrasal SMT system that uses a phrase translation model based on dependency treelets and a global reordering model based on the source dependency tree. These models are combined with several other knowledge sources in a log-linear manner. The weights of the individual components in the loglinear model are set by an automatic ...
متن کاملMicrosoft Research Treelet Translation System: NAACL 2006 Europarl Evaluation
The Microsoft Research translation system is a syntactically informed phrasal SMT system that uses a phrase translation model based on dependency treelets and a global reordering model based on the source dependency tree. These models are combined with several other knowledge sources in a log-linear manner. The weights of the individual components in the loglinear model are set by an automatic ...
متن کاملPhrase-Based SMT with Shallow Tree-Phrases
In this article, we present a translation system which builds translations by gluing together Tree-Phrases, i.e. associations between simple syntactic dependency treelets in a source language and their corresponding phrases in a target language. The Tree-Phrases we use in this study are syntactically informed and present the advantage of gathering source and target material whose words do not h...
متن کاملA Novel Reordering Model for Statistical Machine Translation
Word reordering is one of the fundamental problems of machine translation, and an important factor of its quality and efficiency. In this paper, we introduce a novel reordering model based on an innovative structure, named, phrasal dependency tree including syntactical and statistical information in context of a log-linear model. The phrasal dependency tree is a new modern syntactic structure b...
متن کامل