Universal Dependency Parser: A Single Parser for Many Languages on Arc-Swift
Authors
Abstract
Dependency parsing has been a longstanding task in NLP, with a recent boost in performance thanks to neural network models. However, most dependency parsers are monolingual (a single parser is trained per language) and use transition-based systems that are limited to local information. We use a novel transition-based system, arc-swift, proposed in [1], that incorporates both local and global information and achieves improved performance compared to current transition-based systems. We propose a universal, multilingual dependency parser model built on arc-swift, and we augment this model in two ways: by aligning GloVe word vectors across languages, and by prepending a character-level language detector model to the parser to help it explicitly learn language embeddings. Our results show that word vector alignment greatly improves UAS and LAS scores, while language detector predictions yield only a modest increase in parse scores. Our universal dependency parser performs comparably to monolingual baseline parsers and could help advance downstream applications such as machine translation.
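The abstract reports results as UAS and LAS (unlabeled and labeled attachment score). As a reminder of what these standard metrics measure, here is a minimal sketch; it is not the authors' evaluation code. The function name `attachment_scores` and the toy sentence are illustrative assumptions, and details such as punctuation handling are ignored.

```python
from typing import List, Tuple

# One token's analysis: (head index, dependency label); head 0 means root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] over the given tokens.

    UAS counts tokens whose predicted head matches the gold head.
    LAS additionally requires the dependency label to match.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted parses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty parse")
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, labeled_hits / n


# Hypothetical toy sentence "She reads books" with 1-based head indices.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]  # correct head, wrong label on "books"
uas, las = attachment_scores(gold, pred)
```

In this toy case every head is correct but one label is wrong, so LAS is lower than UAS. LAS can never exceed UAS, because a labeled match requires a head match.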
Similar resources
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
Arc-swift: A Novel Transition System for Dependency Parsing
Transition-based dependency parsers often need sequences of local shift and reduce operations to produce certain attachments. Correct individual decisions hence require global information about the sentence context and mistakes cause error propagation. This paper proposes a novel transition system, arc-swift, that enables direct attachments between tokens farther apart with a single transition....
Effective Online Reordering with Arc-Eager Transitions
We present a new transition system with word reordering for unrestricted non-projective dependency parsing. Our system is based on decomposed arc-eager rather than arc-standard, which allows more flexible ambiguity resolution between a local projective and non-local crossing attachment. In our experiment on Universal Dependencies 2.0, we find our parser outperforms the ordinary swap-based parser ...
Building a Persian Constituency Treebank by Automatic Conversion
Treebanks are an important and useful resource in natural language processing tasks. Dependency and phrase-structure treebanks are the two best-known kinds. Many efforts have already been made to convert dependency structure to phrase structure. In this paper we study an approach to converting dependency structure to phrase structure, motivated by the lack of a large phrase-structure treebank for Persian. A...
One Parser, Many Languages
We train a language-universal dependency parser on a multilingual collection of treebanks. The parsing model uses multilingual word embeddings alongside learned and specified typological information, enabling generalization based on linguistic universals and based on typological similarities. We evaluate our parser’s performance on languages in the training set as well as on the unsupervised sc...