On Detection of Malapropisms by Multistage Collocation Testing
Authors
Abstract
A malapropism is a real-word error in which one content word is unintentionally replaced by another existing content word that is similar in sound but semantically incompatible with the context, thereby destroying text cohesion, e.g., they travel around the word. We present an algorithm for malapropism detection and correction based on evaluating cohesion. As a measure of the semantic compatibility of words, we consider their ability to form syntactically linked and semantically admissible word combinations (collocations), e.g., travel (around the) world. Text cohesion at a content word is then measured as the number of collocations it forms with the words in its immediate context. We detect malapropisms as words that form no collocations in the context. To test whether two words can form a collocation, we consider two types of resources: a collocation database and an Internet search engine such as Google. We illustrate the proposed method by classifying, tracing, and evaluating several English malapropisms.
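The detection rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy collocation set `TOY_DB`, the paronym table `TOY_PARONYMS`, the context window size, and all function names are assumptions made for illustration. The "multistage" idea is modelled as a list of tester functions tried in order; only a database-lookup stage is shown, and a search-engine stage would be another function with the same signature.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Set

Tester = Callable[[str, str], bool]

# Hypothetical toy resources (assumptions, not data from the paper).
TOY_DB: Set[FrozenSet[str]] = {frozenset({"travel", "world"})}
TOY_PARONYMS: Dict[str, List[str]] = {"word": ["ward", "world", "work"]}


def db_tester(a: str, b: str) -> bool:
    """Stage 1: look the unordered word pair up in the collocation set."""
    return frozenset({a, b}) in TOY_DB


def forms_collocation(a: str, b: str, stages: Sequence[Tester]) -> bool:
    """Try each stage in order; stop at the first one that accepts the pair."""
    return any(stage(a, b) for stage in stages)


def cohesion(word: str, context: Sequence[str], stages: Sequence[Tester]) -> int:
    """Number of context words with which `word` forms a collocation."""
    return sum(1 for c in context if forms_collocation(word, c, stages))


def context_of(words: List[str], i: int, window: int) -> List[str]:
    """Content words within `window` positions of index i, excluding i."""
    return words[max(0, i - window):i] + words[i + 1:i + 1 + window]


def suspects(words: List[str], stages: Sequence[Tester], window: int = 2) -> List[int]:
    """Indices of content words that form no collocation in their context."""
    return [
        i for i in range(len(words))
        if cohesion(words[i], context_of(words, i, window), stages) == 0
    ]


def corrections(words: List[str], i: int, stages: Sequence[Tester],
                window: int = 2) -> List[str]:
    """Paronyms of words[i] that restore at least one collocation."""
    ctx = context_of(words, i, window)
    return [p for p in TOY_PARONYMS.get(words[i], []) if cohesion(p, ctx, stages) > 0]
```

Given the content words `["travel", "word"]`, this sketch flags both words, since neither forms a collocation with the other. Only "word" has a paronym ("world") that restores a collocation, so the correction step is what singles it out as the likely error.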
Similar resources
Detection and Correction of Malapropisms in Spanish by Means of Internet Search
Malapropisms are real-word errors that lead to syntactically correct but semantically implausible text. We report an experiment on detection and correction of Spanish malapropisms. Malapropos words semantically destroy collocations (syntactically connected word pairs) they are in. Thus we detect possible malapropisms as words that do not form semantically plausible collocations with neighboring...
Malapropisms Detection and Correction using a Paronyms Dictionary, a Search Engine and Wordnet
This paper presents a method for the automatic detection and correction of malapropism errors found in documents using the WordNet lexical database, a search engine (Google) and a paronyms dictionary. Malapropism detection is based on evaluating the cohesion of the local context using the search engine, while the correction is done using the whole text cohesion evaluated in terms of...
Experiments in Detection and Correction of Russian Malapropisms by Means of the Web
Malapropism is a semantic error that is hardly detectable because it usually retains syntactical links between words in the sentence but replaces one content word by a similar word with quite different meaning. A method of automatic detection of malapropisms is described, based on Web statistics and a specially defined Semantic Compatibility Index (SCI). For correction of the detected errors, s...
Wiener Chaos Versus Stochastic Collocation Methods for Linear Advection-Diffusion-Reaction Equations with Multiplicative White Noise
We compare Wiener chaos and stochastic collocation methods for linear advection-reaction-diffusion equations with multiplicative white noise. Both methods are constructed based on a recursive multistage algorithm for long-time integration. We derive error estimates for both methods and compare their numerical performance. Numerical results confirm that the recursive multistage stochastic colloca...
The Effect of Lexical Collocational Density on the Iranian EFL Learners’ Reading Comprehension
The present study aims at investigating the effect of different levels of lexical collocational density on EFL learners’ reading comprehension. Eighty sophomore students with different levels of proficiency studying at Zand Institute of Higher Education in Shiraz, Iran were chosen from among eighty-five learners based on their score distribution on a reduced TOEFL test constructed by Education...