Visually and Phonologically Similar Characters in Incorrect Simplified Chinese Words

Authors

  • Chao-Lin Liu
  • Min-Hua Lai
  • Yi-Hsuan Chuang
  • Chia-Ying Lee
Abstract

Visually and phonologically similar characters are major contributing factors to errors in Chinese text. By defining appropriate similarity measures that consider extended Cangjie codes, we can identify visually similar characters within a fraction of a second. Relying on the pronunciation information recorded for individual characters in Chinese lexicons, we can compute a list of characters that are phonologically similar to a given character. We collected 621 incorrect Chinese words reported on the Internet and analyzed the causes of these errors. Of these errors, 83% were related to phonological similarity and 48% to visual similarity between the characters involved. The lists of phonologically and visually similar characters generated by our programs contained more than 90% of the incorrect characters in the reported errors.
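
As a rough illustration of how code-based visual similarity might be computed, the sketch below ranks characters by the overlap of their Cangjie codes. It is not the authors' method: the abstract does not specify their similarity measures or their extended codes, so the scoring function (a Dice-style ratio over the longest common subsequence), the names `code_similarity` and `similar_characters`, and the small table of plain Cangjie codes are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Toy table of plain Cangjie codes, for illustration only.
# The paper relies on its own *extended* Cangjie codes, which are not shown here.
CANGJIE: Dict[str, str] = {
    "日": "A",
    "月": "B",
    "明": "AB",
    "朋": "BB",
    "木": "D",
    "林": "DD",
    "森": "DDD",
}


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two code strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def code_similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1]: Dice-style ratio based on the LCS length."""
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def similar_characters(
    char: str, table: Dict[str, str], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k characters whose codes score highest against char."""
    target = table[char]
    scored = [
        (other, code_similarity(target, code))
        for other, code in table.items()
        if other != char
    ]
    scored.sort(key=lambda pair: -pair[1])  # highest score first
    return scored[:top_k]


if __name__ == "__main__":
    for candidate, score in similar_characters("林", CANGJIE):
        print(candidate, round(score, 3))
```

Because each comparison only touches short code strings, scanning a full character table this way is cheap, which is in line with the abstract's remark that similar characters can be found within a fraction of a second. A phonological list could be built analogously by comparing pronunciation entries from a lexicon instead of Cangjie codes.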

Similar resources

Using Structural Information for Identifying Similar Chinese Characters

Chinese characters that are similar in their pronunciations or in their internal structures are useful for computer-assisted language learning and for psycholinguistic studies. Although it is possible for us to employ image-based methods to identify visually similar characters, the resulting computational costs can be very high. We propose methods for identifying visually similar Chinese charact...

The Recognition Of Handwritten Chinese Characters From Paper Records

This paper describes a method used for the recognition of handwritten simplified Chinese characters from paper records. The method is based on the use of discrete hidden Markov models. The recognition accuracy achieved for all 3755 common simplified Chinese characters in GB1 is 91.2% for top 1 choice and 98.5% for top 5 choice. The method recognizes isolated characters only and not words or phr...

Representation of Linguistic Information Determines Its Susceptibility to Memory Interference

We used the dual-task paradigm to infer how linguistic information is represented in the brain by indexing its susceptibility to retrieval interference. We measured recognition memory, in bilingual Chinese-English, and monolingual English speakers. Participants were visually presented with simplified Chinese characters under full attention, and later asked to recognize them while simultaneously...

Attentional Blink Is Hierarchically Modulated by Phonological, Morphological, Semantic and Lexical Connections between Two Chinese Characters

The ability to identify the second of two targets (T2) is impaired if that target is presented less than ∼500 ms after the first (T1). This transient deficit is known as attentional blink (AB). Previous studies have suggested that the magnitude of the AB effect can be modulated by manipulating the allocation of attentional resources to T1 or T2. However, few experiments have used Chinese charac...

Phonological Activation in Visual Identification of Chinese Two-Character Words

Evidence for phonological activation in the recognition of 2-character Chinese words was discovered in 2 experiments. In a meaning-judgment task, Experiment 1 exposed two words with stimulus onset asynchronies (SOAs) of 0, 71, and 157 ms. At all 3 SOAs, times to make a "no" meaning judgment were longer for words that were homophones than for unrelated controls. In a lexical-decision task, Experi...

Publication date: 2010