Extraction and Analysis of English Noun-Noun Compounds with Chinese-English Parallel Corpora

Authors

  • Shao
  • Yan
Abstract

The noun-noun compound is a common type of multiword expression (MWE) in English. Like many other kinds of MWEs, it causes problems in natural language processing. In this paper, we extract noun-noun compounds using their POS tags. The extracted noun-noun compounds are then aligned to their Chinese translations using a word alignment method. Statistical analysis of the alignments shows that English noun-noun compounds are translated in different ways in the parallel corpus. The number of words and the POS tags of the Chinese translations vary, although the majority of noun-noun compounds are translated into single Chinese nouns.
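The two steps named in the abstract (POS-based extraction, then lookup of aligned Chinese words) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tagset, the tagger, or the aligner, so the sketch assumes Penn Treebank common-noun tags (`NN`, `NNS`), already tagged input, and alignments given as (English index, Chinese index) pairs. The names `noun_noun_spans` and `aligned_targets` and the example sentence are hypothetical.

```python
from typing import List, Tuple

# Assumption: Penn Treebank common-noun tags; the abstract does not name a tagset.
NOUN_TAGS = {"NN", "NNS"}


def noun_noun_spans(tagged: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Return (start, end) index spans, end exclusive, of noun runs of length exactly 2."""
    spans = []
    i, n = 0, len(tagged)
    while i < n:
        if tagged[i][1] in NOUN_TAGS:
            j = i
            # Extend the run over all consecutive noun tokens.
            while j < n and tagged[j][1] in NOUN_TAGS:
                j += 1
            # Keep only two-noun runs; longer runs are skipped in this sketch.
            if j - i == 2:
                spans.append((i, j))
            i = j
        else:
            i += 1
    return spans


def aligned_targets(
    span: Tuple[int, int],
    alignment: List[Tuple[int, int]],
    target_tokens: List[str],
) -> List[str]:
    """Collect the target-side tokens aligned to any source token inside the span."""
    start, end = span
    target_idx = sorted({t for s, t in alignment if start <= s < end})
    return [target_tokens[t] for t in target_idx]


# Hypothetical example sentence pair with a hand-written alignment.
english = [("The", "DT"), ("coffee", "NN"), ("cup", "NN"), ("broke", "VBD")]
chinese = ["咖啡杯", "碎", "了"]
alignment = [(1, 0), (2, 0), (3, 1)]

for span in noun_noun_spans(english):
    words = [w for w, _ in english[span[0]:span[1]]]
    print(words, aligned_targets(span, alignment, chinese))
```

In this example both English nouns align to one Chinese token, which is the pattern the abstract reports as the most frequent. Counting the length of each `aligned_targets` result over a corpus would give the kind of distribution the abstract describes.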


Similar resources

Delexical HAVE/MAKE + Noun Collocations: A Comparison of Advanced Swedish and Chinese Learner English

Collocations, i.e., recurrent word combinations such as take advantage (of), strong tea, and deeply absorbed, are receiving increasing attention in SLA research because of their importance to “native-like” production of language. Previous studies suggest that collocations pose a serious challenge to language learners and that learners’ L1 plays a crucial role in this respect. My paper aims to s...


Disambiguation of Single Noun Translations Extracted from Bilingual Comparable Corpora

s of papers of four academic societies, namely Japan Architecture Society (JAS), Institute of Electric Engineering (IEE), Institute of Electronics and Communication Engineering (IECE), and Information Processing Society of Japan (IPSJ), published in Japan. Numbers of abstracts of each of these corpora are shown in Table 1. Parts of these bilingual corpora are parallel. The percentages of parall...


Singular or Plural? Exploiting Parallel Corpora for Chinese Number Prediction

We explore a novel approach to automatically predict noun number in Chinese by using a word-aligned Chinese-English parallel corpus. We first map number information from English onto Chinese to create a dataset labeled with a POS tagset enhanced with number information, and then train a model to automatically predict noun number using a combination of lexical and syntactic features. We evaluate...


Use of Articles in Learning English as a Foreign Language: A Study of Iranian English Undergraduates

The significance of error analysis for the learner, the teacher and the researcher is now widely recognized. Earlier studies of error analysis concentrated on intersystematic comparison of the “native language” and the “target language” and drew the required data largely from intuitions and impressionistic observations. This study was conducted on the basis of the following observations: (1) to...


English-Arabic Transliteration

Proper nouns may be considered as the most important query words in information retrieval. If the two languages use the same alphabet, the same proper nouns can be found in either language. However, if the two languages use different alphabets, the names must be transliterated. Short vowels are not usually marked on the Arabic words in almost all Arabic documents (except very important document...




Publication date: 2013