Outta Control: Laws of Semantic Change and Inherent Biases in Word Representation Models
نویسندگان
چکیده
This article evaluates three proposed laws of semantic change. Our claim is that in order to validate a putative law of semantic change, the effect should be observed in the genuine condition but absent or reduced in a suitably matched control condition, in which no change can possibly have taken place. Our analysis shows that the effects reported in recent literature must be substantially revised: (i) the proposed negative correlation between meaning change and word frequency is shown to be largely an artefact of the models of word representation used; (ii) the proposed negative correlation between meaning change and prototypicality is shown to be much weaker than what has been claimed in prior art; and (iii) the proposed positive correlation between meaning change and polysemy is largely an artefact of word frequency. These empirical observations are corroborated by analytical proofs that show that count representations introduce an inherent dependence on word frequency, and thus word frequency cannot be evaluated as an independent factor with these representations.
منابع مشابه
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملWord Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction
The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in the short term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...
متن کاملThe evolution of the meaning of the word nurse based on the classical texts of Persian literature
Background and Aim: The semantic evolution of a word over time is inevitable, indicating a social, political, religious or cultural process. Nurse is one of the words that has a significant presence in Persian literature texts and has been used in many different meanings such as slave, servan, maid, devotee, obedient, patient and preserver. The purpose of this study is to show its semantic ev...
متن کاملNamed Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
متن کاملDiachronic Word Embeddings Reveal Statistical Laws of Semantic Change
Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec...
متن کامل