Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
Abstract
We introduce two first-order graph-based dependency parsers achieving a new state of the art. The first is a consensus parser built from an ensemble of independently trained greedy LSTM transition-based parsers with different random initializations. We cast this approach as minimum Bayes risk decoding (under the Hamming cost) and argue that weaker consensus within the ensemble is a useful signal of difficulty or ambiguity. The second parser is a “distillation” of the ensemble into a single model. We train the distillation parser using a structured hinge loss objective with a novel cost that incorporates ensemble uncertainty estimates for each possible attachment, thereby avoiding the intractable cross-entropy computations required by applying standard distillation objectives to problems with structured outputs. The first-order distillation parser matches or surpasses the state of the art on English, Chinese, and German.
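The consensus step described in the abstract can be illustrated with a small sketch. It assumes that each ensemble member outputs one head index per token and that the ensemble's vote fractions serve as the attachment probabilities. Under the Hamming cost, the expected cost of a candidate tree is then the sum over tokens of one minus the vote fraction of the chosen head, so minimum Bayes risk decoding reduces to picking the tree with the highest summed vote fraction. The function names (`vote_fractions`, `is_tree`, `mbr_tree`) and the toy data are hypothetical, and the brute-force search stands in for the MST algorithm a real first-order parser would use; it is only practical for very short sentences.

```python
from itertools import product

def vote_fractions(ensemble_heads, n):
    """q[h][m] = fraction of ensemble members attaching token m to head h.
    Tokens are numbered 1..n; index 0 is the artificial root."""
    q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for heads in ensemble_heads:              # heads[m - 1] is the head of token m
        for m, h in enumerate(heads, start=1):
            q[h][m] += 1.0 / len(ensemble_heads)
    return q

def is_tree(heads):
    """True if every token reaches the root (0) without entering a cycle."""
    for m in range(1, len(heads) + 1):
        seen, node = set(), m
        while node != 0:
            if node in seen:                  # revisited a token: cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def mbr_tree(q, n):
    """Brute-force search for the valid tree with the highest summed vote
    fraction, i.e. the lowest expected Hamming cost (n - score).
    Illustrative stand-in for an MST decoder; exponential in n."""
    best, best_score = None, -1.0
    for heads in product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        score = sum(q[h][m] for m, h in enumerate(heads, start=1))
        if score > best_score:
            best, best_score = heads, score
    return best, best_score

# Toy ensemble of three parsers over a three-token sentence (made-up data).
ensemble = [(2, 0, 2), (2, 0, 2), (0, 1, 2)]
q = vote_fractions(ensemble, n=3)
best, score = mbr_tree(q, n=3)
print(best, 3 - score)                        # consensus heads, expected Hamming cost
```

In this toy case two of the three parsers agree on every attachment, so the consensus tree is the one they share. The attachments where the vote is split (tokens 1 and 2) are the ones the abstract would treat as harder or more ambiguous.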
Similar resources
Reverse Revision and Linear Tree Combination for Dependency Parsing
Deterministic transition-based Shift/Reduce dependency parsers often make mistakes in the analysis of long-span dependencies (McDonald & Nivre, 2007). Titov and Henderson (2007) address this accuracy drop by using a beam search instead of a greedy algorithm for predicting the next parser transition. We propose a parsing method that reduces several of these errors, while maintaining a...
Improving Dependency Parsers using Combinatory Categorial Grammar
Subcategorization information is a useful feature in dependency parsing. In this paper, we explore a method of incorporating this information via Combinatory Categorial Grammar (CCG) categories from a supertagger. We experiment with two popular dependency parsers (Malt and MST) for two languages: English and Hindi. For both languages, CCG categories improve the overall accuracy of both parsers ...
Hybrid Combination of Constituency and Dependency Trees into an Ensemble Dependency Parser
Dependency parsing has made many advancements in recent years, in particular for English. There are a few dependency parsers that achieve comparable accuracy scores with each other but with very different types of errors. This paper examines creating a new dependency structure through ensemble learning using a hybrid of the outputs of various parsers. We combine all tree outputs into a weighted...
Partial Accuracy Rates and Agreements of Parsers: Two Experiments With Ensemble Parsing of Czech
We present two experiments with ensemble parsing, in which we obtain a 1.4% improvement in UAS compared to the best parser. We use five parsers: MateParser, TurboParser, Parsito, MaltParser, and MSTParser, and the data of the analytical layer of the Prague Dependency Treebank (1.5 million tokens). We split training data into 10 data-splits and run a 10-fold cross-validation scheme with each of the fiv...
Hebrew Dependency Parsing: Initial Results
We describe a newly available Hebrew Dependency Treebank, which is extracted from the Hebrew (constituency) Treebank. We establish some baseline unlabeled dependency parsing performance on Hebrew, based on two state-of-the-art parsers, MST-parser and MaltParser. The evaluation is performed both in an artificial setting, in which the data is assumed to be properly morphologically segmented and P...