XTAG - A Graphical Workbench for Developing Tree-Adjoining Grammars
Abstract
We describe a workbench (XTAG) for the development of tree-adjoining grammars and their parsers, and discuss some issues that arise in the design of the graphical interface. Unlike string-rewriting grammars that generate trees, the elementary objects manipulated by a tree-adjoining grammar are extended trees (i.e. trees of depth one or more), which capture syntactic information of lexical items. The unique characteristics of tree-adjoining grammars, namely the elementary objects found in the lexicon (extended trees) and the derivational history of derived trees (itself a tree), require a specially crafted interface in which the perspective has shifted from a string-based to a tree-based system. XTAG provides such a graphical interface, in which the elementary objects are trees (or tree sets) and not symbols (or strings). The kernel of XTAG is a predictive left-to-right parser for unification-based tree-adjoining grammar [Schabes, 1991]. XTAG includes a graphical editor for trees, a graphical tree printer, utilities for manipulating and displaying feature structures for unification-based tree-adjoining grammar, facilities for keeping track of the derivational history of TAG trees combined with adjoining and substitution, a parser for unification-based tree-adjoining grammars, utilities for defining grammars and lexicons for tree-adjoining grammars, a morphological recognizer for English (75 000 stems deriving 280 000 inflected forms) and a tree-adjoining grammar for English that covers a large range of linguistic phenomena. Considerations of portability, efficiency, homogeneity and ease of maintenance led us to use Common Lisp without its object language addition, and the X Window interface to Common Lisp (CLX), for the implementation of XTAG. XTAG without the large morphological and syntactic lexicons is public domain software. The large morphological and syntactic lexicons can be obtained through an agreement with ACL's Data Collection Initiative.
Tree-adjoining grammars (TAGs) are tree-rewriting systems in which the syntactic properties of words are encoded as tree-structured objects of extended size. TAG trees can be combined with adjoining and substitution to form new derived trees. Tree-adjoining grammar differs from more traditional tree-generating systems such as context-free grammar in two ways:
1. The objects combined in a tree-adjoining grammar (by adjoining and substitution) are trees and not strings. In this approach, the lexicon associates with a word the entire structure it selects (as shown in Figure 1) and not just a (non-terminal) symbol as in context-free grammars.
2. Unlike string-based systems such as context-free grammars, two objects are built when …
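The two combining operations mentioned above, substitution and adjoining, can be illustrated with a minimal Python sketch. This is not XTAG's implementation (XTAG is written in Common Lisp and uses feature structures, which are omitted here); the names `Node`, `substitute`, `adjoin` and `yield_of`, and the toy trees for "John", "sleeps" and "often", are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; not a data structure taken from XTAG."""
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False  # leaf marked as a substitution site
    foot: bool = False   # foot node of an auxiliary tree


def copy_tree(n: Node) -> Node:
    return Node(n.label, [copy_tree(c) for c in n.children], n.subst, n.foot)


def substitute(tree: Node, initial: Node) -> Node:
    """Copy `tree`, replacing its first matching substitution site by `initial`."""
    done = False

    def walk(n: Node) -> Node:
        nonlocal done
        if not done and n.subst and n.label == initial.label:
            done = True
            return copy_tree(initial)
        return Node(n.label, [walk(c) for c in n.children], n.subst, n.foot)

    result = walk(tree)
    if not done:
        raise ValueError("no substitution site labelled " + initial.label)
    return result


def adjoin(tree: Node, auxiliary: Node) -> Node:
    """Copy `tree`, splicing `auxiliary` in at the first interior node (pre-order)
    whose label matches the auxiliary root; the subtree that was rooted there is
    re-attached under the auxiliary tree's foot node."""
    done = False

    def plug_foot(aux: Node, displaced: List[Node]) -> Node:
        if aux.foot:
            return Node(aux.label, displaced)
        return Node(aux.label, [plug_foot(c, displaced) for c in aux.children], aux.subst)

    def walk(n: Node) -> Node:
        nonlocal done
        if (not done and n.children and not n.subst and not n.foot
                and n.label == auxiliary.label):
            done = True
            return plug_foot(auxiliary, [copy_tree(c) for c in n.children])
        return Node(n.label, [walk(c) for c in n.children], n.subst, n.foot)

    result = walk(tree)
    if not done:
        raise ValueError("no adjunction site labelled " + auxiliary.label)
    return result


def yield_of(n: Node) -> List[str]:
    """Left-to-right leaf labels of a tree."""
    if not n.children:
        return [n.label]
    return [w for c in n.children for w in yield_of(c)]


# Toy elementary trees (illustrative only).
alpha_sleeps = Node("S", [Node("NP", subst=True),
                          Node("VP", [Node("V", [Node("sleeps")])])])
alpha_john = Node("NP", [Node("N", [Node("John")])])
beta_often = Node("VP", [Node("Adv", [Node("often")]), Node("VP", foot=True)])

derived = adjoin(substitute(alpha_sleeps, alpha_john), beta_often)
print(" ".join(yield_of(derived)))
```

Both functions return new trees and leave the elementary trees untouched, so the same elementary tree can be reused across derivations. A fuller sketch would also record which tree was combined at which node, which is the derivational history the abstract refers to.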
Related resources
Status of the XTAG
Technical Report TALANA-RT-94-01, TALANA, Université Paris 7, 1994. Status of the XTAG System. C. Doran, D. Egedi, B. A. Hockey, B. Srinivas. Institute for Research in Cognitive Science, University of Pennsylvania, Philadelphia, PA 19104-6228, USA. Abstract: XTAG is an ongoing project to develop a wide-coverage grammar for English, based on the Featur...
Status of the XTAG System
Appears in the 3e Colloque International sur les grammaires d'Arbres Adjoints (TAG+3). Abstract: XTAG is an ongoing project to develop a wide-coverage grammar for English, based on the Feature-based Lexicalized Tree Adjoining Grammar (FB-LTAG) formalism. The XTAG system integrates a morphological analyzer, an N-best part-of-speech tagger, an Earley-style parser and an X-window interface, along w...
Tools And Resources For Tree Adjoining Grammars
This paper presents a workbench for Tree Adjoining Grammars that we are currently developing. This workbench includes several tools and resources based on the markup language XML, used as a convenient language to format and exchange linguistic resources.
A Python-based Interface for Wide Coverage Lexicalized Tree-adjoining Grammars
This paper describes the design and implementation of a Python-based interface for wide coverage Lexicalized Tree-adjoining Grammars. The grammars are part of the XTAG Grammar project at the University of Pennsylvania, and were hand-written and semi-automatically curated to parse real-world corpora. We provide an interface to the wide coverage English and Korean XTAG grammars. Each XTAG gramma...
LTAG Workbench: A General Framework for LTAG (with Tool demonstration) Patrice
This paper presents the LTAG Workbench, a set of graphical tools and parsers freely available for LTAG. The system can be viewed as a modern alternative to the XTAG system. We first present the outlines of the workbench, including different graphical editors and two chart parsers. The encoding of resources and results is based on an XML application called TagML. We then present future work dedicated...