A tool for detecting French - English cognates and false friends

نویسندگان

  • Oana FRUNZA
  • Diana INKPEN
چکیده

Résumé. Les congénères sont des mots qui ont au moins un sens en commun entre deux langues en plus d‘avoir une orthographie semblable. La reconnaissance de ce type de mots permet aux apprenants de langue seconde ou étrangère d‘enrichir plus rapidement leur vocabulaire et d‘améliorer leur compréhension écrite. Toutefois, les faux amis sont des paires de mots qui à l‘écrit ont des similarités, mais ils ont des significations différentes. Pour leur part, les congénères partiels sont des mots qui ont la même signification dans certains contextes dans chacune des deux langues. Cet article présente une méthode pour la classification automatique des paires des mots classées en congénères ou faux amis, en utilisant des mesures de similarité orthographiques et des méthodes d‘apprentissage automatique. Ainsi, nous construisons des listes complètes des congénères et des faux amis entre les deux langues. Nous désambiguisons les congénères partiels dans des contextes spécifiques. Nos méthodes sont évaluées pour le français et l‘anglais, mais elles seraient applicables à d‘autres paires des langues. Nous avons construit un outil qui prend ces listes et marque dans un texte français les mots qui ont des congénères ou des faux amis en anglais, dans le but d‘aider les apprenants en français langue seconde ou étrangère à améliorer leur compréhension écrite et à développer une meilleure rétention. Abstract. Cognates are pairs of words in different languages similar in spelling and meaning. They can help a second-language learner on the tasks of vocabulary expansion and reading comprehension. False friends are pairs of words that have similar spelling but different meanings. Partial cognates are pairs of words in two languages that have the same meaning in some, but not all contexts. In this article we present a method to automatically classify a pair of words as cognates or false friends, by using several measures of orthographic similarity as features for classification. We use this method to create complete lists of cognates and false friends between two languages. We also disambiguate partial cognates in context. We applied all our methods to French and English, but they can be applied to other pairs of languages as well. We built a tool that takes the produced lists and annotates a French text with equivalent English cognates or false friends, in order to help second-language learners improve their reading comprehension skills and retention rate.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multilingual lexical resources to detect cognates in non-aligned texts

The identification of cognates between two distinct languages has recently started to attract the attention of NLP research, but there has been little research into using semantic evidence to detect cognates. The approach presented in this paper aims to detect English-French cognates within monolingual texts (texts that are not accompanied by aligned translated equivalents), by integrating word...

متن کامل

Disambiguation of partial cognates

Cognates – words that have similar spelling and meaning in two or more languages – can accelerate vocabulary acquisition and facilitate the reading comprehension task. A student has to pay attention to the pairs of words that look and sound similar but have different meanings – false-friend pairs, and especially to pairs of words that share meanings in some but not all contexts – partial cognat...

متن کامل

Detecting English-French Cognates Using Orthographic Edit Distance

Identification of cognates is an important component of computer assisted second language learning systems. We present a simple rule-based system to recognize cognates in English text from the perspective of the French language. At the core of our system is a novel similarity measure, orthographic edit distance, which incorporates orthographic information into string edit distance to compute th...

متن کامل

Using machine learning methods to avoid the pitfall of cognates and false friends in Spanish-Portuguese word pairs

The fact that 85% of the Portuguese lexicon contains Spanish cognates and that the linguistic structures of both languages are highly coincident is believed to be an advantage for the Spanish speaker who learns Portuguese. However, these similarities have some negative aspects in the learning of Portuguese, such as, the pitfall of false friends, since about 20% of cognates are false. The aim of...

متن کامل

Exploring cognition processes in second language acquisition: the case of cognates and false-friends in EST

This article explores one aspect of the processing perspective in L2 learning in an EST context: the processing of new content words, in English, of the type ‘cognates’ and ‘false friends’, by Spanish speaking engineering students. The paper does not try to offer a comprehensive overview of language acquisition mechanisms, but rather it is intended to review more narrowly how our conceptual sys...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007