Capturing Language Specific Constraints on Lexical Selection with Feature-Based Lexicalized Tree-Adjoining Grammars
Abstract
The success of a Machine Translation (MT) application depends on its ability to perform lexical selection, that is, to choose lexical items in the target language that most closely match the lexical items in the input source. This task is particularly difficult in cases, such as those which arise in translating from English to Chinese and Korean, where the target language imposes lexical constraints which are non-existent or completely different in the source. We present an implementation of an English-Korean MT system using Feature-Based, Lexicalized Tree-Adjoining Grammar (FB-LTAG), and demonstrate its ability to handle difficulties involving lexical selection between those two languages. We also describe the applicability of this approach to similar issues which arise in English-Chinese translation. By building language-dependent FB-LTAGs for each language and then linking them via a Synchronous Tree-Adjoining Grammar (STAG), we are able to elegantly model the specific and language-dependent syntactic and semantic distinctions necessary to filter the choice of target lexical items.
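The filtering idea in the abstract can be illustrated with a toy sketch. This is not the system described in the paper: it assumes flat feature structures represented as Python dicts, and the feature name `obj_worn_on`, the tiny lexicons, and the function names are all hypothetical. It uses the well-known fact that Korean has several verbs corresponding to English "wear" (romanized here as ipta, ssuta, sinta), chosen according to the item worn. A real FB-LTAG attaches feature structures to nodes of elementary trees and pairs source and target trees through STAG links, none of which is modeled here.

```python
# Toy illustration only: flat feature structures as dicts.
# All names and feature values below are hypothetical.

def unify(a, b):
    """Unify two flat feature structures; return None on a feature clash."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None  # conflicting values: unification fails
        result[key] = value
    return result

# Hypothetical target-side entries for English "wear": each Korean verb
# places a constraint on a feature of its object.
KOREAN_WEAR = {
    "ipta": {"obj_worn_on": "body"},
    "ssuta": {"obj_worn_on": "head"},
    "sinta": {"obj_worn_on": "feet"},
}

# Hypothetical features carried by the object noun.
NOUN_FEATURES = {
    "shirt": {"obj_worn_on": "body"},
    "hat": {"obj_worn_on": "head"},
    "shoes": {"obj_worn_on": "feet"},
}

def select_verbs(obj_noun):
    """Keep only the candidate verbs whose constraints unify with the object."""
    obj = NOUN_FEATURES[obj_noun]
    return [
        verb
        for verb, constraint in KOREAN_WEAR.items()
        if unify(constraint, obj) is not None
    ]

print(select_verbs("hat"))
```

In this sketch a candidate is discarded as soon as one of its features clashes with the object's features, which is the sense in which feature constraints act as a filter on target lexical items.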
Similar resources
Extraction of Tree Adjoining Grammars from a Treebank for Korean
We present the implementation of a system which extracts not only lexicalized grammars but also feature-based lexicalized grammars from Korean Sejong Treebank. We report on some practical experiments where we extract TAG grammars and tree schemata. Above all, full-scale syntactic tags and well-formed morphological analysis in Sejong Treebank allow us to extract syntactic features. In addition, ...
Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
This paper shows how DATR, a widely used formal language for lexical knowledge representation, can be used to define an LTAG lexicon as an inheritance hierarchy with internal lexical rules. A bottom-up featural encoding is used for LTAG trees and this allows lexical rules to be implemented as covariation constraints within feature structures. Such an approach eliminates the considerable redu...
Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
This paper shows how DATR, a widely used formal language for lexical knowledge representation, can be used to define an LTAG lexicon as an inheritance hierarchy with internal lexical rules. A bottom-up featural encoding is used for LTAG trees and this allows lexical rules to be implemented as covariation constraints within feature structures. Such an approach eliminates the considerable redunda...
A Python-based Interface for Wide Coverage Lexicalized Tree-adjoining Grammars
This paper describes the design and implementation of a Python-based interface for wide coverage Lexicalized Tree-adjoining Grammars. The grammars are part of the XTAGGrammar project at the University of Pennsylvania, which were hand-written and semi-automatically curated to parse real-world corpora. We provide an interface to the wide coverage English and Korean XTAG grammars. Each XTAG gramma...
Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars
In this paper, we identify syntactic lexical ambiguity and sentence complexity as factors that contribute to parsing complexity in fully lexicalized grammar formalisms such as Lexicalized Tree Adjoining Grammars. We also report on experiments that explore the effects of these factors on parsing complexity. We discuss how these constraints can be exploited in improving efficiency of parsers for ...