Extracting Dependency Trees from Sanskrit Texts
Abstract
In this paper, I describe a hybrid dependency tree parser for Sanskrit sentences that improves on a purely lexical parsing approach by adding simple syntactic rules and grammatical information. The performance of the parser is demonstrated on a group of sentences from epic literature.
Similar resources
Extracting Formal Models from Normative Texts
Normative texts are documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to model such texts using the C-O Diagram formalism, making them amenable to formal analysis, in particular verifying that a text satisfies properties concerning causality of actions and timing constraints. We present an experimental, semi-automatic aid to bridge the gap between a nor...
Converting Phrase Structures to Dependency Structures in Sanskrit
Two annotation schemes for presenting parsed structures are prevalent, viz. the constituency structure and the dependency structure. While constituency trees mark the relations due to positions, dependency relations mark the semantic dependencies. Free word order languages like Sanskrit pose more problems for constituency parses since the elements within a phrase are dislocated. In ...
Determining the Statistical Significance of Rules for Rule-based Knowledge-extraction Algorithms
Domain-specific knowledge bases are often built from domain-specific texts using rule-based knowledge-retrieval algorithms. These algorithms are based on semantic extraction rules that process text using a parser, looking at the resulting parse trees and dependency graphs and then applying those rules to identify possible constructs for triple extraction. The performance of such algorithms critical...
Learning Subgraph Patterns from Text for Extracting Disease-Symptom Relationships
To some extent, texts can be represented in the form of graphs, such as dependency graphs in which nodes represent words and edges represent grammatical dependencies between words. Graph representation of texts is an interesting alternative to string representation because it provides an additional level of abstraction over the syntax that is sometimes easier to compute. In this paper, we study ...
Coarse Semantic Classification of Rare Nouns Using Cross-Lingual Data and Recurrent Neural Networks
The paper presents a method for WordNet supersense tagging of Sanskrit, an ancient Indian language with a corpus grown over four millennia. The proposed method merges lexical information from Sanskrit texts with lexicographic definitions from Sanskrit-English dictionaries, and compares the performance of two machine learning methods for this task. Evaluation concentrates on Vedic, the oldest lay...