Extraction and Analysis of English Noun-Noun Compounds with Chinese-English Parallel Corpora
Authors
Abstract
The noun-noun compound is a common type of multiword expression (MWE) in English. Like many other kinds of MWEs, it causes problems in natural language processing. In this paper, we extract noun-noun compounds using their POS tags. The extracted noun-noun compounds are then aligned to their Chinese translations using a word alignment method. Statistical analysis of the alignments shows that English noun-noun compounds are translated in different ways in the parallel corpus. Both the number of words and the POS tags of the Chinese translations vary, although the majority of noun-noun compounds are translated into single Chinese nouns.
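The POS-based extraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Penn Treebank tags (NN, NNS), the restriction to runs of exactly two nouns, and the function name extract_noun_noun are all assumptions made for illustration, and the input is assumed to be already tokenized and tagged.

```python
from typing import List, Tuple

# Assumption: Penn Treebank common-noun tags; the paper's tagset is not stated here.
NOUN_TAGS = {"NN", "NNS"}


def extract_noun_noun(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (noun, noun) pairs from maximal noun runs of length exactly two.

    Longer runs (e.g. three nouns in a row) are skipped, which is an
    illustrative design choice rather than something the abstract specifies.
    """
    compounds = []
    i, n = 0, len(tagged)
    while i < n:
        # Find the end of the maximal run of nouns starting at position i.
        j = i
        while j < n and tagged[j][1] in NOUN_TAGS:
            j += 1
        if j - i == 2:
            compounds.append((tagged[i][0], tagged[i + 1][0]))
        # Skip past the run, or advance by one if position i is not a noun.
        i = max(j, i + 1)
    return compounds


# Hypothetical pre-tagged sentence used only to exercise the function.
sentence = [
    ("The", "DT"), ("stock", "NN"), ("market", "NN"), ("fell", "VBD"),
    ("after", "IN"), ("the", "DT"), ("interest", "NN"), ("rate", "NN"),
    ("decision", "NN"), (".", "."),
]
print(extract_noun_noun(sentence))
```

Requiring a maximal run keeps "interest rate decision" from being counted as two overlapping compounds; a real pipeline might handle longer compounds differently.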
Similar resources
Delexical HAVE/MAKE + Noun Collocations: A Comparison of Advanced Swedish and Chinese Learner English
Collocations, i.e., recurrent word combinations such as take advantage (of), strong tea, and deeply absorbed, are receiving increasing attention in SLA research because of their importance to “native-like” production of language. Previous studies suggest that collocations pose a serious challenge to language learners and that learners’ L1 plays a crucial role in this respect. My paper aims to s...
Disambiguation of Single Noun Translations Extracted from Bilingual Comparable Corpora
s of papers of four academic societies, namely Japan Architecture Society (JAS), Institute of Electric Engineering (IEE), Institute of Electronics and Communication Engineering (IECE), and Information Processing Society of Japan (IPSJ), published in Japan. Numbers of abstracts of each of these corpora are shown in Table 1. Parts of these bilingual corpora are parallel. The percentages of parall...
Singular or Plural? Exploiting Parallel Corpora for Chinese Number Prediction
We explore a novel approach to automatically predict noun number in Chinese by using a word-aligned Chinese-English parallel corpus. We first map number information from English onto Chinese to create a dataset labeled with a POS tagset enhanced with number information, and then train a model to automatically predict noun number using a combination of lexical and syntactic features. We evaluate...
Use of Articles in Learning English as a Foreign Language: A Study of Iranian English Undergraduates
The significance of error analysis for the learner, the teacher and the researcher is now widely recognized. Earlier studies of error analysis concentrated on intersystematic comparison of the “native language” and the “target language” and drew the required data largely from intuitions and impressionistic observations. This study was conducted on the basis of the following observations: (1) to...
English-Arabic Transliteration
Proper nouns may be considered as the most important query words in information retrieval. If the two languages use the same alphabet, the same proper nouns can be found in either language. However, if the two languages use different alphabets, the names must be transliterated. Short vowels are not usually marked on the Arabic words in almost all Arabic documents (except very important document...