LAF/GrAF-grounded Representation of Dependency Structures
Abstract
This paper shows that a LAF/GrAF-based annotation schema can be used for the adequate representation of syntactic dependency structures in many languages. We first argue that there are at least two types of textual units that can be annotated with dependency information: words/tokens and chunks/phrases. Based on this consideration, we discuss a sub-typing of GrAF to represent the corresponding dependency structures. We then describe a wrapper program that, as a proof of concept, converts output data from different dependency parsers in proprietary XML formats to the GrAF-compliant XML representation.
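The conversion step described above can be sketched as follows. This is a minimal illustration, not the authors' wrapper: the input format (token character offsets plus head/dependent/label triples) and the function name dependencies_to_graf are hypothetical, and the element and attribute names (graph, region, node, link, edge, a) and the namespace URI reflect a general reading of the GrAF serialization rather than anything shown in this abstract. The sub-typing for words versus chunks discussed in the paper is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Assumed GrAF namespace URI; check it against the schema you target.
GRAF_NS = "http://www.xces.org/ns/GrAF/1.0/"


def dependencies_to_graf(tokens, deps):
    """Build a GrAF-style graph element from parser output.

    tokens: list of (start, end) character offsets, one per token
    deps:   list of (head_index, dependent_index, label) triples
    Both input shapes are hypothetical stand-ins for a parser's own format.
    """
    graph = ET.Element("graph", {"xmlns": GRAF_NS})

    # One region (text span) and one node per token; the node links to its region.
    for i, (start, end) in enumerate(tokens):
        ET.SubElement(graph, "region",
                      {"xml:id": f"r{i}", "anchors": f"{start} {end}"})
        node = ET.SubElement(graph, "node", {"xml:id": f"n{i}"})
        ET.SubElement(node, "link", {"targets": f"r{i}"})

    # One edge per dependency, directed from head to dependent,
    # plus an annotation element that carries the relation label.
    for j, (head, dep, label) in enumerate(deps):
        ET.SubElement(graph, "edge",
                      {"xml:id": f"e{j}", "from": f"n{head}", "to": f"n{dep}"})
        ET.SubElement(graph, "a",
                      {"label": label, "ref": f"e{j}", "as": "dep"})

    return graph
```

Calling ET.tostring(dependencies_to_graf(tokens, deps), encoding="unicode") serializes the result. Keeping the dependency label on a separate annotation element, rather than on the edge itself, follows the stand-off idea of separating graph structure from annotation content.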
Related papers
A LAF/GrAF based Encoding Scheme for underspecified Representations of syntactic Annotations
Data models and encoding formats for syntactically annotated text corpora need to deal with syntactic ambiguity; underspecified representations are particularly well suited for the representation of ambiguous data because they allow for high informational efficiency. We discuss the issue of being informationally efficient, and the trade-off between efficient encoding of linguistic annotations a...
A standardized general framework for encoding and exchange of corpus annotations: The Linguistic Annotation Framework, LAF
The Linguistic Annotation Framework, LAF, proposes a generic data model for exchange of linguistic annotations and has recently become an ISO standard (ISO 24612:2012). This paper describes some aspects of LAF, its XML-serialization GrAF and some use-cases related to the framework. While GrAF has already been used as exchange format for corpora with several annotation layers, such as MASC and O...
The Linguistic Annotation Framework: a standard for annotation interchange and merging
This paper overviews the International Standards Organization Linguistic Annotation Framework (ISO LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO LAF, the Graph Annotation Format (GrAF) and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples ...
GrAF: A Graph-based Format for Linguistic Annotations
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used to represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized ...
Transition-Based Parsing for Deep Dependency Structures
Derivations under different grammar formalisms allow extraction of various dependency structures. Particularly, bilexical deep dependency structures beyond surface tree representation can be derived from linguistic analysis grounded by CCG, LFG, and HPSG. Traditionally, these dependency structures are obtained as a by-product of grammar-guided parsers. In this article, we study the alternative ...