lexical similarity

Interpreting compound nouns with kernel methods

Journal: Natural Language Engineering, 2013

Diarmuid Ó Séaghdha, Ann A. Copestake

This paper presents a classification-based approach to noun-noun compound interpretation within the statistical learning framework of kernel methods. In this framework, the primary modelling task is to define measures of similarity between data items, formalised as kernel functions. We consider the different sources of information that are useful for understanding compounds and proceed to defin...

SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity

2012

Eneko Agirre, Daniel M. Cer, Mona T. Diab, Aitor Gonzalez-Agirre

Semantic Textual Similarity (STS) measures the degree of semantic equivalence between two texts. This paper presents the results of the STS pilot task in SemEval. The training data contained 2000 sentence pairs from previously existing paraphrase datasets and machine translation evaluation resources. The test data also comprised 2000 sentence pairs for those datasets, plus two surprise dataset...

Self-Organizing Lexical Feature Maps Semiotic Interpretation and Possible Application in Lexicography

2009

Semiotic interpretation of lexical cohesion is a major research challenge in both theoretical and applied linguistics. From the point of view of usage-based language description, individual lexical units can be roughly characterized by their collocation profiles, i.e., by collections of condensed usage patterns extracted from very large corpora. It is posited that related lexical units tend to s...

Comparison of the Baseline Knowledge-, Corpus-, and Web-based Similarity Measures for Semantic Relations Extraction

2011

Alexander Panchenko

Unsupervised methods of semantic relation extraction rely on a similarity measure between lexical units. Similarity measures differ both in the kinds of information they use and in how this information is transformed into a similarity score. This paper takes a further step in evaluating the available similarity measures in the context of semantic relation extraction. We com...

A Structural Similarity Measure

2006

Petr Homola, Vladislav Kuboň

This paper outlines a measure of language similarity based on structural similarity of surface syntactic dependency trees. Unlike the more traditional string-based measures, this measure tries to reflect “deeper” correspondences among languages. The development of this measure has been inspired by the experience from MT of syntactically similar languages. This experience shows that the lexical ...

Inner speech slips exhibit lexical bias, but not the phonemic similarity effect.

Journal: Cognition, 2008

Gary M. Oppenheim, Gary S. Dell

Inner speech, that little voice that people often hear inside their heads while thinking, is a form of mental imagery. The properties of inner speech errors can be used to investigate the nature of inner speech, just as overt slips are informative about overt speech production. Overt slips tend to create words (lexical bias) and involve similar exchanging phonemes (phonemic similarity effect). ...

An empirical study of semantic similarity in WordNet and Word2Vec

2017

Abram Handler

This thesis performs an empirical analysis of Word2Vec by comparing its output to WordNet, a well-known, human-curated lexical database. It finds that Word2Vec tends to uncover more of certain types of semantic relations than others, with Word2Vec returning more hypernyms, synonyms and hyponyms than hyponyms or holonyms. It also shows the probability that neighbors separated by a given cosin...

Second-Order Cohesion

Journal: Computational Intelligence, 2000

Stefan Kaufmann

Similarity in contextual behavior between words is considered a source of “lexical cohesion,” which is otherwise hard to measure or quantify. Such contextual similarity is used by an implementation for text segmentation, the VecTile system, which uses precompiled vector representations of words to produce similarity curves over texts. The performance of this system is shown to improve over that...
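
To illustrate what a similarity curve over a text might look like, here is a minimal Python sketch. It is not the VecTile implementation: the function names, the fixed window size, and the choice to sum word vectors and compare adjacent windows by cosine are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def window_vector(tokens, vectors, dim):
    """Sum the precompiled vectors of the tokens; unknown tokens contribute nothing."""
    total = [0.0] * dim
    for token in tokens:
        for k, x in enumerate(vectors.get(token, ())):
            total[k] += x
    return total

def similarity_curve(tokens, vectors, dim, window=2):
    """Cosine similarity between the windows left and right of each gap.

    Hypothetical helper: low points on the curve suggest candidate
    segment boundaries.
    """
    curve = []
    for gap in range(window, len(tokens) - window + 1):
        left = window_vector(tokens[gap - window:gap], vectors, dim)
        right = window_vector(tokens[gap:gap + window], vectors, dim)
        curve.append(cosine(left, right))
    return curve

# Toy two-dimensional vectors, invented for this example.
toy_vectors = {"cat": [1, 0], "dog": [1, 0], "tax": [0, 1], "law": [0, 1]}
toy_text = ["cat", "dog", "cat", "dog", "tax", "law", "tax", "law"]
toy_curve = similarity_curve(toy_text, toy_vectors, dim=2, window=2)
```

In this toy text the curve dips at the gap between the fourth and fifth tokens, where the vocabulary shifts, which is the kind of minimum a segmenter could treat as a boundary.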

PolyUCOMP: Combining Semantic Vectors with Skip bigrams for Semantic Textual Similarity

2012

Jian Xu, Qin Lu, Zhengzhong Liu

This paper presents the work of the Hong Kong Polytechnic University (PolyUCOMP) team which has participated in the Semantic Textual Similarity task of SemEval-2012. The PolyUCOMP system combines semantic vectors with skip bigrams to determine sentence similarity. The semantic vector is used to compute similarities between sentence pairs using the lexical database WordNet and the Wikipedia corp...
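
As a rough illustration of the skip-bigram idea, the Python sketch below extracts ordered word pairs that may have other words between them and scores two sentences by their overlap. This is not the PolyUCOMP system: the whitespace tokenisation, the `max_skip` parameter, and the Dice-style score are assumptions chosen for the example, and the semantic-vector component is not shown.

```python
from itertools import combinations

def skip_bigrams(tokens, max_skip=None):
    """Return the set of ordered token pairs (skip bigrams) in a token list.

    max_skip limits how many tokens may sit between the two members of a
    pair; None means no limit.
    """
    pairs = set()
    for i, j in combinations(range(len(tokens)), 2):
        if max_skip is None or j - i - 1 <= max_skip:
            pairs.add((tokens[i], tokens[j]))
    return pairs

def skip_bigram_similarity(sentence_a, sentence_b, max_skip=None):
    """Dice-style overlap of the two sentences' skip-bigram sets, in [0, 1]."""
    set_a = skip_bigrams(sentence_a.lower().split(), max_skip)
    set_b = skip_bigrams(sentence_b.lower().split(), max_skip)
    if not set_a and not set_b:
        return 0.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))
```

Because pairs keep their word order but tolerate gaps, two sentences that share the same words in the same order with extra material inserted still overlap, whereas plain bigrams would miss those pairs.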

Topic Models for Meaning Similarity in Context

2010

Georgiana Dinu, Mirella Lapata

Recent work on distributional methods for similarity focuses on using the context in which a target word occurs to derive context-sensitive similarity computations. In this paper we present a method for computing similarity which builds vector representations for words in context by modeling senses as latent variables in a large corpus. We apply this to the Lexical Substitution Task and we show...
