Discriminating Among Word Senses Using McQuitty's Similarity Analysis

Author

  • Amruta Purandare
Abstract

This paper presents an unsupervised method for discriminating among the senses of a given target word based on the context in which it occurs. Instances of a word that occur in similar contexts are grouped together via McQuitty’s Similarity Analysis, an agglomerative clustering algorithm. The context in which a target word occurs is represented by surface lexical features such as unigrams, bigrams, and second order co-occurrences. This paper summarizes our approach, and describes the results of a preliminary evaluation we have carried out using data from the SENSEVAL-2 English lexical sample and the line corpus.
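The clustering step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: contexts are represented by unigram counts only (no bigrams or second order co-occurrences), cosine similarity is the similarity measure, and the number of clusters is fixed in advance. The names `cosine` and `mcquitty_cluster` are hypothetical. The one part that is specific to McQuitty's Similarity Analysis is the update rule: after two clusters merge, the similarity of the new cluster to any other cluster is the plain average of the two old similarities.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mcquitty_cluster(contexts, target, num_clusters):
    """Group the contexts of `target` into `num_clusters` clusters.

    Returns a sorted list of clusters, each a sorted list of context indices.
    """
    # Assumption: unigram features only, i.e. the words surrounding the target.
    vectors = [
        Counter(w for w in text.lower().split() if w != target)
        for text in contexts
    ]
    # Start with every context in its own cluster.
    clusters = {i: [i] for i in range(len(contexts))}
    sim = {
        frozenset(pair): cosine(vectors[pair[0]], vectors[pair[1]])
        for pair in combinations(clusters, 2)
    }
    next_id = len(contexts)
    while len(clusters) > num_clusters:
        # Merge the most similar pair of current clusters.
        a, b = max(combinations(clusters, 2), key=lambda p: sim[frozenset(p)])
        merged = clusters.pop(a) + clusters.pop(b)
        # McQuitty update: average the two old similarities, ignoring
        # how many instances each merged cluster contained.
        for k in clusters:
            sim[frozenset((next_id, k))] = (
                sim[frozenset((a, k))] + sim[frozenset((b, k))]
            ) / 2
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Toy contexts for the target word "line" (invented for illustration).
examples = [
    "the telephone line was busy all day",
    "she waited in a long line at the bank",
    "a busy telephone line kept ringing",
    "people stood in line at the bank counter",
]
print(mcquitty_cluster(examples, "line", 2))  # [[0, 2], [1, 3]]
```

On these four toy contexts, the telephone uses end up in one cluster and the queue uses in the other. Real data would need tokenization, feature selection, and a stopping criterion, none of which this sketch attempts.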

Similar resources

Word net based Method for Determining Semantic Sentence Similarity through various Word Senses

Semantic similarity is a confidence score that replicates semantic equivalence between the meanings of two sentences. Determining the similarity among sentences is one of the critical tasks which have a wide-ranging impact in recent NLP applications. This paper presents a method for identifying semantic sentence similarity among sentences using semantic relation of word senses across the differ...

KSU KDD: Word Sense Induction by Clustering in Topic Space

We describe our language-independent unsupervised word sense induction system. This system only uses topic features to cluster different word senses in their global context topic space. Using unlabeled data, this system trains a latent Dirichlet allocation (LDA) topic model then uses it to infer the topics distribution of the test instances. By clustering these topics distributions in their top...

Word Sense Induction Using Lexical Chain based Hypergraph Model

Word Sense Induction is a task of automatically finding word senses from large scale texts. It is generally considered as an unsupervised clustering problem. This paper introduces a hypergraph model in which nodes represent instances of contexts where a target word occurs and hyperedges represent higher-order semantic relatedness among instances. A lexical chain based method is used for discove...

The Mental Representation of Polysemy across Word Classes

Experimental studies on polysemy have come to contradictory conclusions on whether words with multiple senses are stored as separate or shared mental representations. The present study examined the semantic relatedness and semantic similarity of literal and non-literal (metonymic and metaphorical) senses of three word classes: nouns, verbs, and adjectives. Two methods were used: a psycholinguis...

Merging Verb Senses of Hindi WordNet using Word Embeddings

In this paper, we present an approach for merging fine-grained verb senses of Hindi WordNet. Senses are merged based on gloss similarity score. We explore the use of word embeddings for gloss similarity computation and compare with various WordNet based gloss similarity measures. Our results indicate that word embeddings show significant improvement over WordNet based measures. Consequently, we...

Publication date: 2003