Identification and Classification of Proper Nouns in Chinese Texts
Abstract
Various strategies are proposed to identify and classify three types of proper nouns in Chinese texts. Clues from the character, sentence and paragraph levels are employed to resolve Chinese personal names. Character, Syllable and Frequency Conditions are presented to treat transliterated personal names. To deal with organization names, keywords, prefixes, word association and parts of speech are applied. For fair evaluation, large-scale test data are selected from six sections of a newspaper. The precision and recall for these three types are (88.04%, 92.56%), (50.62%, 71.93%) and (61.79%, 54.50%), respectively. When the former two types are regarded as a single category, the performance becomes (81.46%, 91.22%). Compared with other approaches, our approach performs better and classifies the names automatically.
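The (precision, recall) pairs quoted above can be made concrete with a short sketch. It assumes exact-match scoring over sets of (start, end, type) triples, which the abstract does not describe, and it adds the F1 harmonic mean purely as an illustrative summary; the abstract reports no F1 figures. The function names and the toy data are hypothetical.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of (start, end, type) items.

    Assumption: an item counts as correct only on an exact match.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical toy data, not taken from the paper.
gold = {(0, 3, "PER"), (10, 14, "ORG"), (20, 23, "PER")}
predicted = {(0, 3, "PER"), (10, 14, "PER"), (20, 23, "PER"), (30, 32, "ORG")}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2%} recall={r:.2%} f1={f1(p, r):.2%}")

# (precision, recall) pairs in percent, as reported in the abstract.
reported = {
    "Chinese personal names": (88.04, 92.56),
    "transliterated personal names": (50.62, 71.93),
    "organization names": (61.79, 54.50),
}
for name, (prec, rec) in reported.items():
    print(f"{name}: F1 = {f1(prec, rec):.2f}%")
```

In the toy data, the span (10, 14) is found but labelled with the wrong type, so it counts against both precision and recall. This is one way a system can identify a name correctly and still lose points on classification.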
Similar resources
Proper name knowledge acquisition for text understanding
Current work in proper name analysis is focused on identification and limited categorisation of names. Some research has been carried out in acquiring knowledge of proper names from the contextual information within texts. In this study, we investigate how to transform human-oriented compilations, which contain a rich knowledge of proper names, into formally represented knowledge for computer co...
Semantic Classification of Chinese Unknown Words
This paper describes a classifier that assigns semantic thesaurus categories to unknown Chinese words (words not already in the CiLin thesaurus and the Chinese Electronic Dictionary, but in the Sinica Corpus). The focus of the paper differs in two ways from previous research in this particular area. Prior research in Chinese unknown words mostly focused on proper nouns (Lee 1993, Lee, Lee and C...
Named Entity Recognition in Assamese
Named Entity Recognition is a process through which a program extracts proper nouns in texts and associates them with a proper tag. NER has made significant progress in European languages, but in Indian languages due to the lack of effort as well as proper resources, it remains a challenging task. Recognizing ambiguities and assigning the correct tags to the names is the main goal of NER. Thus ...
Categorization And Standardizing Proper Nouns For Efficient Information Retrieval
In this paper, we describe the most recent implementation and evaluation of the proper noun categorization and standardization module of the DRLINK document detection system being developed at Syracuse University, under the auspices of ARPA's TIPSTER program. We also discuss the expansion of group common nouns and group proper nouns to enhance retrieval recall. Successful proper noun boundary i...
Translation Quality Assessment of English Equivalents of Persian Proper Nouns: A case of bilingual tourist signposts in Isfahan
This study evaluated the translation quality of English equivalents of Persian proper nouns on tourist signs and bilingual boards in Isfahan. To find the different errors in the translations, the data were collected directly by photographing or transcribing the available tourist signs and bilingual boards. Then, the errors were assesse...
Publication date: 1996