Experiments in Detection and Correction of Russian Malapropisms by Means of the Web

Authors

  • Elena Bolshakova
  • Igor Bolshakov
  • Alexey Kotlyarov
Abstract

A malapropism is a semantic error that is hard to detect because it usually preserves the syntactic links between words in a sentence while replacing one content word with a similar word of quite different meaning. A method for automatic detection of malapropisms is described, based on Web statistics and a specially defined Semantic Compatibility Index (SCI). For correction of the detected errors, special dictionaries and heuristic rules are proposed, which retain only a few highly SCI-ranked correction candidates for the user's selection. Experiments on Web-assisted detection and correction of Russian malapropisms are reported, demonstrating the efficacy of the described method.
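The detect-then-correct flow outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the SCI formula, so the `compatibility` function below is a stand-in (a log ratio of pair hits to the geometric mean of single-word hits), not the authors' definition. The function names, the threshold, the hit counts, and the English word pair are all hypothetical and serve only to show the shape of the procedure.

```python
import math

def compatibility(pair_hits, left_hits, right_hits):
    """Stand-in score, NOT the paper's SCI (an assumption for illustration).

    Compares how often two words occur together with how often each
    occurs on its own; higher means a more plausible word pair.
    """
    if pair_hits == 0 or left_hits == 0 or right_hits == 0:
        return float("-inf")
    return math.log2(pair_hits / math.sqrt(left_hits * right_hits))

def suggest_corrections(head, suspect, candidates, counts, threshold, keep=3):
    """Return up to `keep` candidates if (head, suspect) looks implausible.

    `counts` maps a word or a (word, word) tuple to a hit count, standing
    in for Web statistics. `candidates` stands in for entries from a
    dictionary of similar-looking words.
    """
    def score(word):
        return compatibility(
            counts.get((head, word), 0),
            counts.get(head, 0),
            counts.get(word, 0),
        )

    if score(suspect) >= threshold:
        return []  # the pair is plausible enough; nothing to flag

    # Keep only candidates that pass the threshold, best score first.
    passing = [(score(c), c) for c in candidates if score(c) >= threshold]
    passing.sort(reverse=True)
    return [word for _, word in passing[:keep]]

# Invented hit counts, for illustration only.
example_counts = {
    "travel": 1000, "word": 1000, "world": 1000, "ward": 1000,
    ("travel", "word"): 1,
    ("travel", "world"): 500,
    ("travel", "ward"): 2,
}
suggestions = suggest_corrections(
    "travel", "word", ["world", "ward"], example_counts, threshold=-5.0
)
```

With these invented counts, the pair ("travel", "word") scores below the threshold and only "world" survives as a correction candidate, which mirrors the idea of leaving a short ranked list for the user to choose from.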


Related articles

Detection and Correction of Malapropisms in Spanish by Means of Internet Search

Malapropisms are real-word errors that lead to syntactically correct but semantically implausible text. We report an experiment on detection and correction of Spanish malapropisms. Malapropos words semantically destroy collocations (syntactically connected word pairs) they are in. Thus we detect possible malapropisms as words that do not form semantically plausible collocations with neighboring...


On Correction of Semantic Errors in Natural Language Texts with a Dictionary of Literal Paronyms

Due to the open nature of the Web, search engines must include means of meaningful processing of incorrect texts, including automatic error detection and correction. One widespread type of error in Internet texts is the malapropism, i.e., a semantic error that replaces a word with another existing word similar in letter composition and/or sound but semantically incompatible with the context. Metho...


An Experiment in Detection and Correction of Malapropisms Through the Web

Malapropism is a type of semantic error. It replaces one content word with another content word similar in sound but semantically incompatible with the context, thus destroying text cohesion. We propose to signal a malapropism when a pair of syntactically linked content words in a text exhibits a value of a specially defined Semantic Compatibility Index (SCI) lower than a predetermined thr...


Malapropisms Detection and Correction using a Paronyms Dictionary, a Search Engine and Wordnet

This paper presents a method for the automatic detection and correction of malapropism errors found in documents using the WordNet lexical database, a search engine (Google) and a paronyms dictionary. Malapropism detection is based on evaluating the cohesion of the local context using the search engine, while correction is done using the whole text cohesion evaluated in terms of...


On Detection of Malapropisms by Multistage Collocation Testing

Malapropism is a (real-word) error in a text consisting of the unintended replacement of one content word by another existing content word similar in sound but semantically incompatible with the context, thus destroying text cohesion, e.g.: they travel around the word. We present an algorithm for malapropism detection and correction based on evaluating the cohesion. As a measure of semantic comp...




Publication date: 2006