SRCB-WSD: Supervised Chinese Word Sense Disambiguation with Key Features
Abstract
This article describes the implementation of a Word Sense Disambiguation system that participated in the SemEval-2007 multilingual Chinese-English lexical sample task. We adopted a supervised learning approach with a Maximum Entropy classifier. The features used were neighboring words and their parts of speech, single words in the context, and other syntactic features based on shallow parsing. In addition, we used word category information from a Chinese thesaurus as features for verb disambiguation. For the task we participated in, we obtained a micro-average precision of 0.716, which was the best among all participating systems.
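The first two feature types named in the abstract (neighboring words with their parts of speech, and single words in the context) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_features`, the window size, the feature-key format, and the example sentence are all assumptions made for illustration, and the shallow-parsing and thesaurus features are omitted because the abstract does not specify them.

```python
def extract_features(tokens, pos_tags, target_index, window=2):
    """Build a sparse feature dict for the ambiguous word at target_index.

    Hypothetical helper: the window size and key format are assumptions.
    """
    features = {}
    # Neighboring words and their POS tags, keyed by relative position.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        i = target_index + offset
        if 0 <= i < len(tokens):
            features[f"w[{offset:+d}]={tokens[i]}"] = 1.0
            features[f"pos[{offset:+d}]={pos_tags[i]}"] = 1.0
    # Single words anywhere in the context, ignoring position (bag of words).
    for i, token in enumerate(tokens):
        if i != target_index:
            features[f"bow={token}"] = 1.0
    return features


# Illustrative, made-up example: the target word is the verb at index 1.
tokens = ["他", "打", "电话", "给", "我"]
pos_tags = ["r", "v", "n", "p", "r"]
feats = extract_features(tokens, pos_tags, target_index=1)
```

Feature dictionaries of this shape can be passed to any maximum entropy (multinomial logistic regression) trainer that accepts sparse named features; which toolkit the system actually used is not stated in the abstract.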
Similar resources
Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation
We propose a supervised word sense disambiguation (WSD) system that uses features obtained from clustering results of word instances. Our approach is novel in that we employ semi-supervised clustering that controls the fluctuation of the centroid of a cluster, and we select seed instances by considering the frequency distribution of word senses and exclude outliers when we introduce “must-link”...
Theme: A Study of Classifier Combination and Semi-Supervised Learning for Word Sense Disambiguation
1. Aims Word Sense Disambiguation (WSD) involves the association of a polysemous word in a text or discourse with a particular sense among numerous potential senses of that word. In my thesis, we present a study of classifier combination and semi-supervised learning for WSD, which aim to boost supervised WSD and improve accuracy of WSD. In addition, we also work on context representation and fe...
Semi-Supervised Learning for Word Sense Disambiguation: Quality vs. Quantity
In this paper, we discuss the importance of the quality against the quantity of automatically extracted examples for word sense disambiguation (WSD). We first show that we can build a competitive WSD system with a memory-based classifier and a feature set reduced to easily and efficiently computable features. We then show that adding automatically annotated examples improves the performance of ...
Disambiguation with Feature Selection and Semi-Supervised Learning
1. Objective Word Sense Disambiguation (WSD) is the task of determining the right sense of a polysemous word in a given context. This study aims to enhance the performance of supervised-based word sense determination by focusing on feature selection and using bootstrapping techniques. Senses determination of a word is essentially based on the information extracted from the context in which this...
Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study
A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. Our investigation reveals that this method ...