Morphosemantic parsing of medical compound words: Transferring a French analyzer to English

Authors

  • Louise Deléger
  • Fiammetta Namer
  • Pierre Zweigenbaum
Abstract

PURPOSE: Medical language, like many technical languages, is rich in morphologically complex words, many of which take their roots from Greek and Latin, in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. The structure of these compounds has also been observed to be similar across several European languages, which suggests that the same linguistic analysis could be applied to neoclassical compounds from different languages with minor modifications.

METHODS: This paper reports work on adapting a morphosemantic analyzer designed for French (DériF) to analyze English medical neoclassical compounds. It presents the principles of this transposition and its current performance.

RESULTS: The analyzer was tested on a set of 1299 compounds extracted from the WHO-ART terminology. Of these, 859 could be decomposed and defined, 675 of them successfully.

CONCLUSION: An advantage of this process is that complex linguistic analyses designed for French could be successfully transposed to the analysis of English medical neoclassical compounds, which confirmed our hypothesis of transferability. The fact that the method was successfully applied to a Germanic language such as English suggests that performance would be at least as high for Romance languages such as Spanish. Finally, the resulting system can produce more complete analyses of English medical compounds than existing systems, including a hierarchical decomposition and a semantic gloss of each word.
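To make the idea of decomposing a neoclassical compound and producing a gloss more concrete, here is a minimal sketch in Python. It is not DériF and does not reproduce its rules or output format: the lexicon, the function names (`analyze`, `_split_stem`), the handling of the linking vowel "o", and the gloss wording are all assumptions chosen for illustration.

```python
# Toy illustration only: a hypothetical mini-lexicon, NOT DériF's resources.
COMBINING_FORMS = {
    "gastr": "stomach",
    "hepat": "liver",
    "enter": "intestine",
    "nephr": "kidney",
    "my": "muscle",
}
SUFFIXES = {
    "itis": "inflammation of",
    "algia": "pain in",
    "megaly": "enlargement of",
    "pathy": "disease of",
}


def _split_stem(stem):
    """Split a stem into known combining forms, or return None if impossible.

    A single linking vowel "o" between components is skipped
    (an assumption of this sketch).
    """
    if stem == "":
        return []
    for form in sorted(COMBINING_FORMS, key=len, reverse=True):
        if not stem.startswith(form):
            continue
        rest = stem[len(form):]
        candidates = [rest]
        if rest.startswith("o"):
            candidates.append(rest[1:])  # drop the linking vowel
        for candidate in candidates:
            tail = _split_stem(candidate)
            if tail is not None:
                return [form] + tail
    return None


def analyze(word):
    """Return parts, suffix and a rough gloss, or None if the word is not covered."""
    word = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if not word.endswith(suffix):
            continue
        parts = _split_stem(word[: -len(suffix)])
        if parts:  # needs at least one recognized combining form
            meanings = " and ".join(COMBINING_FORMS[p] for p in parts)
            return {
                "parts": parts,
                "suffix": suffix,
                "gloss": f"{SUFFIXES[suffix]} {meanings}",
            }
    return None


if __name__ == "__main__":
    for w in ["gastritis", "gastroenteritis", "hepatomegaly", "table"]:
        print(w, "->", analyze(w))
```

Words outside the toy lexicon simply return `None`, which loosely mirrors the distinction in the results between compounds that could be decomposed and those that could not. A real analyzer would need far richer lexical resources and rules than this flat lookup.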

Related articles

Defining Medical Words: Transposing Morphosemantic Analysis from French to English

Medical language, like many technical languages, is rich in morphologically complex words, many of which take their roots from Greek and Latin, in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. This paper reports work on adapting a morphosemantic analyzer designed for French (DériF) to analyze English medical neoclas...

Defining and relating biomedical terms: Towards a cross-language morphosemantics-based system

This paper addresses the issue of how semantic information, i.e. both a definition and a set of semantic relations, can be automatically assigned to compound terms. This is particularly crucial when building multilingual databases and when developing cross-language information retrieval systems. The paper shows how morphosemantics can contribute to the constitution of multilingual lexical net...

Character Stream Parsing of Mixed-lingual Text

In multilingual countries, text-to-speech synthesis systems often have to deal with sentences containing inclusions from multiple other languages in the form of phrases, words or even parts of words. Such sentences can only be correctly processed using a system that incorporates a mixed-lingual morphological and syntactic analyzer. A prerequisite for such an analyzer is the correct identification of w...

Munich-Edinburgh-Stuttgart Submissions of OSM Systems at WMT13

This paper describes Munich-Edinburgh-Stuttgart’s submissions to the Eighth Workshop on Statistical Machine Translation. We report results of the translation tasks from German, Spanish, Czech and Russian into English and from English to German, Spanish, Czech, French and Russian. The systems described in this paper use OSM (Operation Sequence Model). We explain different pre-/post-processing ste...

Adapting an English Morphological Analyzer for French

A word-based morphological analyzer and a dictionary for recognizing inflected forms of French words have been built by adapting the UDICT system. We describe the adaptations, emphasizing mechanisms developed to handle French verbs. This work lays the groundwork for doing French derivational morphology and morphology for other languages.

Journal:
  • International Journal of Medical Informatics

Volume 78, Suppl 1

Published 2009