Non-Projectivity in the Ancient Greek Dependency Treebank
Abstract
In this paper, we provide a quantitative analysis of the non-projective constructions attested in the Ancient Greek Dependency Treebank (AGDT). We consider the different types of formal constraints and metrics that have become standard in the literature on non-projectivity (planarity, well-nestedness, gap degree, edge degree). We also discuss some of the linguistic factors that cause non-projective edges in Ancient Greek. Our results confirm the remarkable extent of non-projectivity in the AGDT, both in the quantitative incidence of non-projective nodes and in their complexity, which is not paralleled by the corpora of modern languages considered in the literature. At the same time, our research confirms the usefulness of other constraints (especially well-nestedness).
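As an illustration of two of the notions named in the abstract, the following is a minimal sketch, not the authors' implementation, of how non-projective edges and gap degree might be computed for a single sentence. It assumes a hypothetical encoding in which `heads[i - 1]` is the head of word `i` (words numbered from 1, with 0 marking the artificial root) and uses the common textbook definitions; the function names are invented for this example.

```python
def projections(heads):
    """Map each word (1..n) to its projection: itself plus all its descendants.

    Assumes `heads` encodes a well-formed tree (no cycles); 0 is the root.
    """
    n = len(heads)
    proj = {i: {i} for i in range(1, n + 1)}
    for node in range(1, n + 1):
        ancestor = heads[node - 1]
        while ancestor != 0:  # walk up to the artificial root
            proj[ancestor].add(node)
            ancestor = heads[ancestor - 1]
    return proj


def non_projective_edges(heads):
    """Return (head, dependent) pairs where some word lying between the
    two is not a descendant of the head. Edges from the artificial root
    are skipped by convention in this sketch."""
    proj = projections(heads)
    result = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        if head == 0:
            continue
        lo, hi = sorted((head, dep))
        if any(k not in proj[head] for k in range(lo + 1, hi)):
            result.append((head, dep))
    return result


def gap_degree(heads):
    """Largest number of discontinuities in any word's projection."""
    worst = 0
    for nodes in projections(heads).values():
        ordered = sorted(nodes)
        gaps = sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > 1)
        worst = max(worst, gaps)
    return worst


# Toy four-word tree: 2 is the root, 1 depends on 2, 4 on 1, 3 on 4.
toy = [2, 0, 4, 1]
print(non_projective_edges(toy))
print(gap_degree(toy))
```

In the toy tree, the edge from word 1 to word 4 spans word 2, which word 1 does not dominate, so the edge is non-projective and the projection of word 1 has one gap.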
Similar resources
Will a Parser Overtake Achilles? First experiments on parsing the Ancient Greek Dependency Treebank
We present a number of experiments on parsing the Ancient Greek Dependency Treebank (AGDT), i.e. the largest syntactically annotated corpus of Ancient Greek currently available (ca. 350k words). Although the AGDT is rather unbalanced and far from being representative of all genres and periods of Ancient Greek, no attempt has been made so far to perform automatic dependency parsing of Ancient Gre...
An Ownership Model of Annotation: The Ancient Greek Dependency Treebank
We describe here the first release of the Ancient Greek Dependency Treebank (AGDT), a 190,903-word syntactically annotated corpus of literary texts including the works of Hesiod, Homer and Aeschylus. While the far larger works of Hesiod and Homer (142,705 words) have been annotated under a standard treebank production method of soliciting annotations from two independent reviewers and then reco...
Insights into Non-projectivity in Hindi
Large scale efforts are underway to create dependency treebanks and parsers for Hindi and other Indian languages. Hindi, being a morphologically rich, flexible word order language, brings challenges such as handling non-projectivity in parsing. In this work, we look at non-projectivity in Hyderabad Dependency Treebank (HyDT) for Hindi. Non-projectivity has been analysed from two perspectives: g...
Porting an Ancient Greek and Latin Treebank
We have recently converted a dependency treebank, consisting of ancient Greek and Latin texts, from one annotation scheme to another that was independently designed. This paper makes two observations about this conversion process. First, we show that, despite significant surface differences between the two treebanks, a number of straightforward transformation rules yield a substantial level of ...
Structured Knowledge for Low-Resource Languages: The Latin and Ancient Greek Dependency Treebanks
We describe here our work in creating treebanks – large collections of syntactically annotated data – for Latin and Ancient Greek. While the treebanks themselves present important datasets for traditional research in philology and linguistics, the layers of structured knowledge they contain (including disambiguated lemma, morphological, and syntactic information for every word) help offset the ...