Effects of High-Order Co-occurrences on Word Semantic Similarities

نویسندگان

  • Benoît Lemaire
  • Guy Denhière
چکیده

A computational model of the construction of word meaning through exposure to texts is built in order to simulate the effects of co-occurrence values on word semantic similarities, paragraph by paragraph. Semantic similarity is here viewed as association. It turns out that the similarity between two words W1 and W2 strongly increases with a co-occurrence, decreases with the occurrence of W1 without W2 or W2 without W1, and slightly increases with high-order co-occurrences. Therefore, operationalizing similarity as a frequency of co-occurrence probably introduces a bias: first, there are cases in which there is similarity without co-occurrence and, second, the frequency of co-occurrence overestimates similarity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Effects of High-Order Co-occurrences on Word Semantic Similarity

A computational model of the construction of word meaning through exposure to texts is built in order to simulate the effects of co-occurrence values on word semantic similarities, paragraph by paragraph. Semantic similarity is here viewed as association. It turns out that the similarity between two words W1 and W2 strongly increases with a co-occurrence, decreases with the occurrence of W1 wit...

متن کامل

A New Document Embedding Method for News Classification

Abstract- Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is lots of news spreading on the web. A text classifier can categorize news automatically and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...

متن کامل

Word clustering effect on vocabulary learning of EFL learners: A case of semantic versus phonological clustering

The aim of this study is to determine the effect of word clustering method on vocabulary learning of Iranian EFL learners through a case of semantic versus phonological clustering. To this effect, 80 homogeneous students from four intermediate classes at an English institute in Torbat e Heydariyeh participated in this research. They were assigned to four groups according to semantic versus phon...

متن کامل

Co-word maps and topic modeling: A comparison using small and medium-sized corpora (N < 1, 000)

Induced by “big data,” “topic modeling” has become an attractive alternative to mapping cowords in terms of co-occurrences and co-absences using network techniques. Does topic modeling provide an alternative for co-word mapping in research practices using moderately sized document collections? We return to the word/document matrix using first a single text with a strong argument (“The Leiden Ma...

متن کامل

Acquiring Word Similarities with Higher Order Association Mining

We present a novel approach to mine word similarity in Textual Case Based Reasoning. We exploit indirect associations of words, in addition to direct ones for estimating their similarity. If word A co-occurs with word B, we say A and B share a first order association between them. If A co-occurs with B in some documents, and B with C in some others, then A and C are said to share a second order...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/0804.0143  شماره 

صفحات  -

تاریخ انتشار 2006