Part of Speech Tagging and Local Word Grouping Techniques for Natural Language Parsing in Hindi
Abstract
We present an algorithm for local word grouping that extracts fixed word order dependencies in Hindi sentences. Local word grouping is achieved by defining regular expressions for the word groups, and ambiguities that arise during grouping are resolved. Because Hindi is a free word order language, extracting fixed-order word groups is essential for reducing the load on the free word order parser. The parsing paradigm used is the computational Paninian model. The resulting word groups can also provide input to intonation and prosody modelling units for text-to-speech systems in Indian languages. Part-of-speech tagging is an essential prerequisite for local word grouping, so we present a second algorithm for part-of-speech tagging based on lexical sequence constraints in Hindi. This algorithm acts as the first level of a part-of-speech tagger and uses constraint propagation based on ontological information, information from morphological analysis, and lexical rules.
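The idea of defining word groups as regular expressions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag names (`JJ`, `NN`, `PSP`, `VM`, `VAUX`), the two group patterns, the function name `group_words`, and the transliterated example sentence are all assumptions made for illustration, since the abstract does not give the actual expressions. Ambiguity resolution is reduced here to a fixed pattern priority, which is likely simpler than what the paper does.

```python
import re

# Hypothetical group definitions over POS tags (not taken from the paper).
# Each tag in the encoded string is followed by a single space, so a
# pattern can only ever consume whole tags.
GROUP_PATTERNS = [
    # optional adjectives, a noun, an optional postposition
    ("NOUN_GROUP", re.compile(r"(?:JJ )*NN (?:PSP )?")),
    # a main verb followed by any number of auxiliaries
    ("VERB_GROUP", re.compile(r"VM (?:VAUX )*")),
]


def group_words(tagged):
    """Group (word, tag) pairs left to right using the patterns above.

    Patterns are tried in list order at each position; the first one
    that matches wins. Words that start no group are emitted alone.
    """
    groups = []
    i = 0
    while i < len(tagged):
        # Encode the remaining tags as a space-terminated string.
        tag_string = "".join(tag + " " for _, tag in tagged[i:])
        for label, pattern in GROUP_PATTERNS:
            match = pattern.match(tag_string)
            if match:
                # One space per tag, so this counts the tags consumed.
                n = match.group(0).count(" ")
                groups.append((label, [word for word, _ in tagged[i:i + n]]))
                i += n
                break
        else:
            groups.append(("UNGROUPED", [tagged[i][0]]))
            i += 1
    return groups


# Transliterated example, roughly "Ram has read the book".
sentence = [
    ("raam", "NN"), ("ne", "PSP"), ("kitaab", "NN"),
    ("paRh", "VM"), ("lii", "VAUX"), ("hai", "VAUX"),
]

for label, words in group_words(sentence):
    print(label, words)
```

Matching over the tag sequence rather than the words themselves reflects the abstract's point that part-of-speech tagging must precede grouping. The groups, not the individual words, are what a free word order parser would then consume.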
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...
Parts Of Speech Tagging for Indian Languages: A Literature Survey
Part-of-speech (POS) tagging is the process of assigning a part-of-speech tag or other lexical class marker to every word in a sentence. In many Natural Language Processing applications, such as word sense disambiguation, information retrieval, information processing, parsing, question answering, and machine translation, POS tagging is considered one of the basic necessary tool...
Part-of-Speech Tagging System for Indian Social Media Text on Twitter
Automatic part-of-speech (POS henceforth) tagging is a primary necessity for many Natural Language Processing (NLP) applications, such as homonym disambiguation, text-to-speech processing, information retrieval, natural language parsing, and information extraction. In this paper we concentrate on POS tagging systems for Hindi and Bengali tweets. Although automatic POS tagging is a well...
A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of the Persian Language
In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...
Rule Based Hindi Part of Speech Tagger
A part-of-speech tagger is an important tool for developing language translation and information extraction systems. The tagging problem in natural language processing is to find a way to tag every word in a sentence. In this paper, we present a Rule Based Part of Speech Tagger for Hindi. Our system is evaluated over a corpus of 26,149 words with 30 different standard part of speech tags for H...