Parser Evaluation Using Derivation Trees: A Complement to evalb
Authors
Abstract
This paper introduces a new technique for phrase-structure parser analysis that categorizes possible treebank structures by integrating regular expressions into derivation trees. We analyze the performance of the Berkeley parser on OntoNotes WSJ and the English Web Treebank, which gives some insight into the evalb scores and into the problem of domain adaptation to web data. We also analyze a “test-on-train” dataset, which shows wide variance in how the parser generalizes from different structures in the training material.
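For readers unfamiliar with the evalb scores mentioned in the abstract, the following is a minimal sketch of labeled-bracket precision, recall, and F1 over a single gold/test tree pair. It is a simplification and not the evalb program itself: it assumes well-formed Penn-style bracketings over the same tokens, and it skips the punctuation and label handling that evalb parameter files normally configure. The function names `labeled_spans` and `bracket_scores` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def labeled_spans(tree_str):
    """Collect (label, start, end) for every constituent above the POS level.

    Assumes a well-formed Penn-style bracketing such as
    "(S (NP (DT the) (NN dog)) (VP (VBD barked)))".
    """
    tokens = tree_str.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []  # entries: [label, start word index, has a child constituent]
    pos = 0     # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            # The token after "(" is the node label.
            stack.append([tokens[i + 1], pos, False])
            i += 2
        elif tok == ")":
            label, start, has_child = stack.pop()
            # Preterminals (POS tags) have no child constituent and are skipped.
            if has_child:
                spans[(label, start, pos)] += 1
            if stack:
                stack[-1][2] = True
            i += 1
        else:
            pos += 1  # a word
            i += 1
    return spans


def bracket_scores(gold_str, test_str):
    """Return (precision, recall, f1) over labeled brackets (illustrative only)."""
    gold = labeled_spans(gold_str)
    test = labeled_spans(test_str)
    matched = sum((gold & test).values())  # multiset intersection
    precision = matched / sum(test.values())
    recall = matched / sum(gold.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


gold = "(S (NP (DT the) (NN dog)) (VP (VBD barked)))"
test = "(S (NP (DT the) (NN dog) (VP (VBD barked))))"
precision, recall, f1 = bracket_scores(gold, test)
```

In this toy pair the test tree wrongly attaches the VP inside the NP, so the NP bracket fails to match while the S and VP brackets still do. A single aggregate score of this kind does not say which structures caused the errors, which is the gap a structure-by-structure analysis aims to fill.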
Similar resources
Empirical Evaluation of Tree distances for Parser Evaluation
In this empirical study, I compare various tree distance measures – originally developed in computational biology for the purpose of tree comparison – for the purpose of parser evaluation. I will control for the parser setting by comparing the automatically generated parse trees from the state-of-the-art parser (Charniak, 2000) with the gold-standard parse trees. The article describes two differ...
Tree Distance and Some Other Variants of Evalb
Some alternatives to the standard evalb measures for parser evaluation are considered, principally the use of a tree-distance measure, which assigns a score to a linearity- and ancestry-respecting mapping between trees, in contrast to the evalb measures, which assign a score to a span-preserving mapping. Analysis of the evalb measures suggests the other variants, concerning different no...
Improved Parsing for Argument-Clusters Coordination
Syntactic parsers perform poorly in prediction of Argument-Cluster Coordination (ACC). We change the PTB representation of ACC to be more suitable for learning by a statistical PCFG parser, affecting 125 trees in the training set. Training on the modified trees yields a slight improvement in EVALB scores on sections 22 and 23. The main evaluation is on a corpus of 4th grade science exams, in wh...
Effective Parsing Using Competing CFG Rules
In this paper, a new pruning method for a rule-based parser is described that relies on separating the underlying grammar rules into several mutually competing levels. This method has been developed and exploited for Czech in the syntactic parser Synt to reduce the number of possible output derivation trees. The underlying algorithm operates on a so-called packed forest of trees, a compressing data ...
Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies
This paper investigates a generative history-based parsing model that synchronises the derivation of non-planar graphs representing semantic dependencies with the derivation of dependency trees representing syntactic structures. To process non-planarity online, the semantic transition-based parser uses a new technique to dynamically reorder nodes during the derivation. While the synchronised de...