Semantic Categorization of Contextual Features Based on Wordnet for G-to-P Conversion of Arabic Numerals Combined with Homographic Classifiers
نویسندگان
چکیده
Arabic numerals show a high occurrence-frequency and deliver significant senses, especially in scientific or informative texts. The problem, how to convert Arabic numerals to phonemes with ambiguous classifiers in Korean, is not easily resolved. In this paper, the ambiguities of Arabic numerals combined with homographic classifiers are analyzed and the resolutions for their sense disambiguation based on KorLex (Korean Lexico-Semantic Network) are proposed. Words proceeding or following the Arabic Numerals are categorized into 54 semantic classes based on the lexical hierarchy in KorLex 1.0. The semantic classes are trained to classify the meaning and the reading of Arabic Numerals using a decision tree. The proposed model shows 87.3% accuracy which is 14.1% higher than the baseline.
منابع مشابه
Disambiguation Based on Wordnet for Transliteration of Arabic Numerals for Korean TTS
Transliteration of Arabic numerals is not easily resolved. Arabic numerals occur frequently in scientific and informative texts and deliver significant meanings. Since readings of Arabic numerals depend largely on their context, generating accurate pronunciation of Arabic numerals is one of the critical criteria in evaluating TTS systems. In this paper, (1) contextual, pattern, and arithmetic f...
متن کاملAutomatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
متن کاملAutomatic Recognition of Off-line Handwritten Arabic (Indian) Numerals Using Support Vector and Extreme Learning Machines
This paper describes a technique using Support Vector (SVM) and Extreme Learning Machines (ELM) for automatic recognition of off-line handwritten Arabic (Indian) numerals. The features of angle, distance, horizontal, and vertical span are extracted from these numerals. The database has 44 writers with 48 samples of each digit totaling 21120 samples. A two-stage exhaustive parameter estimation t...
متن کاملدستهبندی پرسشها با استفاده از ترکیب دستهبندها
Question answering systems are produced and developed to provide exact answers to the question posted in natural language. One of the most important parts of question answering systems is question classification. The purpose of question classification is predicting the kind of answer needed for the question in natural language. The literature works can be categorized as rule-based and learning...
متن کاملتشخیص آریتمی انقباضات زودرس بطنی در سیگنال الکتریکی قلب با استفاده ازترکیب طبقهبندها
Cardiovascular diseases are the most dangerous diseases and one of the biggest causes of fatality all over the world. One of the most common cardiac arrhythmias which has been considered by physicians is premature ventricular contraction (PVC) arrhythmia. Detecting this type of arrhythmia due to its abundance of all ages, is particularly important. ECG signal recording is a non-invasive, popula...
متن کامل