Different Techniques Implemented in Gurumukhi Word Sense Disambiguation

Authors

Himdweep Walia

Ajay Rana

Vineet Kansal

Abstract

Word Sense Disambiguation (WSD) is one of the most important problems in Natural Language Engineering. Gurumukhi, more commonly known as Punjabi, is the world's 12th most widely spoken language and is morphologically rich. Surprisingly, relatively little effort has gone into computerizing the language and developing lexical resources for it. This motivates the development of a Punjabi corpus that will help in tagging word senses. The availability of sense-tagged corpora contributes greatly to advances in WSD. The most accurate WSD systems use supervised learning algorithms to learn contextual rules or classification models automatically from sense-annotated examples; Naïve Bayes, k-NN and Support Vector Machine (SVM) classifiers have all shown high accuracy in WSD. Most work on WSD focuses on English and other European languages, for which standard test corpora are available. The lack of such standards is a major hindrance to WSD research for Punjabi and other regional Indian languages. This gap defines the objective of this survey.
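To make the supervised setting mentioned in the abstract concrete, here is a minimal sketch of a Naïve Bayes sense classifier trained on sense-annotated examples. It is an illustration only, not the authors' method: the function names (`train_naive_bayes`, `predict_sense`), the bag-of-context-words features, the add-one smoothing, and the toy English data for the word "bank" are all assumptions made for this example. A Punjabi system would need a sense-tagged Punjabi corpus in place of the toy data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count senses and context words from (context_words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)  # sense -> counts of context words
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def predict_sense(model, context_words):
    """Return the sense with the highest log-probability for the context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context_words:
            if w in vocab:  # words never seen in training are skipped
                # add-one (Laplace) smoothed likelihood of w given the sense
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy training data for the ambiguous English word "bank".
TRAIN = [
    (["deposit", "money", "account"], "finance"),
    (["loan", "interest", "money"], "finance"),
    (["river", "water", "fishing"], "river"),
    (["muddy", "river", "shore"], "river"),
]
model = train_naive_bayes(TRAIN)
```

Calling `predict_sense(model, ["money", "loan"])` on this toy data selects the `finance` sense, because those context words occur only in the finance-tagged examples. The same interface could sit in front of a k-NN or SVM classifier; only the scoring step would change.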

Similar resources

Using Domain Information for Word Sense Disambiguation

The major goal in ITC-irst's participation at SENSEVAL-2 was to test the role of domain information in word sense disambiguation. The underlying working hypothesis is that domain labels, such as MEDICINE, ARCHITECTURE and SPORT provide a natural way to establish semantic relations among word senses, which can be profitably used during the disambiguation process. For each task in which we partic...

Word Sense Disambiguation of Ambiguous Persian Words Using the LDA Topic Model

Word sense disambiguation is the task of identifying the correct sense of a word in a given context from a finite set of possible senses. In this paper a model for Farsi word sense disambiguation is presented. The model uses two groups of features: first, all words and stop words around the target word, and second, topic models. We extract topics from a Farsi corpus with Latent Dirichlet ...

Combining Weak Knowledge Sources for Sense Disambiguation

There has been a tradition of combining different knowledge sources in Artificial Intelligence research. We apply this methodology to word sense disambiguation (WSD), a long-standing problem in Computational Linguistics. We report on an implemented sense tagger which uses a machine readable dictionary to provide both a set of senses and associated forms of information on which to base disambigu...

DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation

Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and ...
