“CDG LAB”: a Toolbox for Dependency Grammars and Dependency Treebanks Development
Authors
Abstract
We present “CDG LAB”, a toolkit for developing dependency grammars and treebanks. It uses Categorial Dependency Grammars (CDG) as its formal model of dependency grammars. CDG are very expressive: they generate unlimited dependency structures, are analyzed in polynomial time, and can be conservatively extended by regular type expressions without loss of parsing efficiency. These features make them well suited to defining large-scale grammars. CDG LAB also supports analyzing the correctness of treebanks developed in parallel with evolving grammars.
Similar resources
"CDG Lab": An Integrated Environment for Categorial Dependency Grammar and Dependency Treebank Development
We present “CDG Lab”, an integrated environment for development of dependency grammars and treebanks. It uses the Categorial Dependency Grammars (CDG) as a formal model of dependency grammars. CDG are very expressive. They generate unlimited dependency structures, are analyzed in polynomial time and are conservatively extendable by regular type expressions without loss of parsing efficiency. Du...
Categorial Dependency Grammars with Iterated Sequences
Some dependency treebanks use special sequences of dependencies where main arguments are mixed with separators. Classical Categorial Dependency Grammars (CDG) do not allow this construction because iterative dependency types only introduce the iterations of the same dependency. An extension of CDG is defined here that introduces a new construction for repeatable sequences of one or several depe...
Validation Issues induced by an Automatic Pre-Annotation Mechanism in the Building of Non-projective Dependency Treebanks
In order to build large dependency treebanks using the CDG Lab, a grammar-based dependency treebank development tool, an annotator usually has to fill a selection form before parsing. This step is usually necessary because, otherwise, the search space is too big for long sentences and the parser fails to produce at least one solution. With the information given by the annotator on the selection...
Categorial Dependency Grammars: from Theory to Large Scale Grammars
Categorial Dependency Grammars (CDG) generate unlimited projective and non-projective dependency structures, are completely lexicalized, and are analyzed in polynomial time. We present an extension of CDG, also analyzed in polynomial time, that is designed for large-scale dependency grammars. For the extended CDG we define a specific method of “Structural Bootstrapping” consisting in incremental con...
Converting Dependency Structures to Phrase Structures
Treebanks are of two types according to their annotation schemata: phrase-structure treebanks such as the English Penn Treebank [8] and dependency treebanks such as the Czech dependency Treebank [6]. Long before treebanks were developed and widely used for natural language processing, there had been much discussion comparing dependency grammars and context-free phrase-structure gramm...