Optimizing Classifier Performance in Word Sense Disambiguation by Redefining Sense Classes
Abstract
Learning word sense classes has been shown to be useful in fine-grained word sense disambiguation [Kohomban and Lee, 2005]. However, the common choice for sense classes, the WordNet lexicographer files, is not designed for machine-learning-based word sense disambiguation. In this work, we explore the use of clustering techniques to construct sense classes that are better suited to the word sense disambiguation end task. Our results show that these classes can significantly improve classifier performance over state-of-the-art results in unrestricted word sense disambiguation.
Similar resources
Optimizing feature set for Chinese Word Sense Disambiguation
This article describes the implementation of the I2R word sense disambiguation system (I2R-WSD), which participated in one Senseval-3 task: the Chinese lexical sample task. Our core algorithm is a supervised Naive Bayes classifier. This classifier uses an optimal feature set, determined by maximizing the cross-validated accuracy of the NB classifier on the training data. The optimal feature set incl...
Augmented Mixture Models for Lexical Disambiguation
This paper investigates several augmented mixture models that are competitive alternatives to standard Bayesian models and prove well suited to word sense disambiguation and related classification tasks. We present a new classification correction technique that successfully addresses the problem of under-estimation of infrequent classes in the training data. We show that the mixture mod...
Modeling Consensus: Classifier Combination for Word Sense Disambiguation
This paper demonstrates the substantial empirical success of classifier combination for the word sense disambiguation task. It investigates more than 10 classifier combination methods, including second-order classifier stacking, over 6 major structurally different base classifiers (enhanced Naïve Bayes, cosine, Bayes Ratio, decision lists, transformation-based learning and maximum variance boost...
Trajectory Based Word Sense Disambiguation
Classifier combination is a promising way to improve the performance of word sense disambiguation. We propose a new combination method in this paper. We first construct a series of Naïve Bayesian classifiers over a sequence of context windows of progressively varying size, and perform sense selection for both training samples and test samples using these classifiers. We thus get a sense selection tra...
Automatic Word Sense Disambiguation (wsd) System
This paper presents an automatic word sense disambiguation (WSD) system that uses Part-of-Speech (POS) tags along with word classes as discrete features. Word classes are derived from the Word Class Assigner using the Word Exchange Algorithm from statistical language processing. A Naïve Bayes classifier from Weka is employed in both the training and testing phases to perform the supervised le...