Experiments in cross-language medical information retrieval using a mixing translation module
Abstract
Given the ever-increasing scale and diversity of medical literature published in English on the Internet, improving cross-language information retrieval is an urgent research objective. Cross-language medical information retrieval (CLMIR) consists of submitting a query in one language and searching medical document collections in one or more other languages. The intended users of CLMIR can read biomedical texts in English but have difficulty formulating English queries. This paper proposes a French/English CLMIR system, built as a mixing model, to support the retrieval of English medical documents. The method follows the query translation approach: it uses a hybrid machine translation system that combines a pattern-based module with a rule-based translator and comprises three steps, from pre-translation to post-translation. In parallel with this hybrid machine translation, the multilingual UMLS Metathesaurus serves as a complementary translator. The results show that the mixing translation module outperforms both the machine translation-based method and the thesaurus-based method used separately.
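The idea of running two translators in parallel and mixing their outputs can be illustrated with a minimal sketch. The abstract does not say how the two outputs are combined, so the order-preserving union used here is an assumption, and the dictionaries `TOY_MT` and `TOY_THESAURUS` are hypothetical toy data standing in for the hybrid machine translation system and the UMLS Metathesaurus.

```python
# Hypothetical toy data: a stand-in for a machine translation system.
TOY_MT = {
    "traitement": "treatment",
    "du": "of",
    "cancer": "cancer",
    "sein": "breast",
}

# Hypothetical toy data: a stand-in for a multilingual thesaurus that maps
# French terms to English terms.
TOY_THESAURUS = {
    "cancer du sein": ["breast neoplasms"],
}


def mt_translate(query: str) -> list[str]:
    """Word-by-word toy translation; unknown words pass through unchanged."""
    return [TOY_MT.get(word, word) for word in query.lower().split()]


def thesaurus_translate(query: str) -> list[str]:
    """Return the English terms of every thesaurus entry found in the query."""
    text = query.lower()
    terms: list[str] = []
    for french_term, english_terms in TOY_THESAURUS.items():
        if french_term in text:
            terms.extend(english_terms)
    return terms


def mixed_translate(query: str) -> list[str]:
    """Assumed mixing strategy: union of both outputs, order preserved."""
    combined = mt_translate(query) + thesaurus_translate(query)
    seen: set[str] = set()
    result: list[str] = []
    for term in combined:
        if term not in seen:
            seen.add(term)
            result.append(term)
    return result


if __name__ == "__main__":
    print(mixed_translate("traitement du cancer du sein"))
```

In this sketch the thesaurus contributes a multi-word controlled term that the word-by-word translator cannot produce, which is one plausible reason a mixed query might retrieve more relevant documents than either translator alone.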
Related articles
Ontologies in Cross-Language Information Retrieval
We present an approach to using ontologies as interlingua in cross-language information retrieval in the medical domain. Our approach is based on using the Unified Medical Language System (UMLS) as the primary ontology. Documents and queries are annotated with multiple layers of linguistic information (part-of-speech tags, lemmas, phrase chunks). Based on this we identify medical terms and sema...
University of Hagen at CLEF 2005: Towards a Better Baseline for NLP Methods in Domain-Specific Information Retrieval
The third participation of the University of Hagen at the German Indexing and Retrieval Test (GIRT) task of the Cross Language Evaluation Campaign (CLEF 2005) aims at providing a better baseline for experiments with natural language processing (NLP) methods in domain-specific information retrieval (IR). Our monolingual experiments with the German document collection are based on a setup combinin...
KECIR Question Answering System at NTCIR7 CCLQA
At the NTCIR-7 CCLQA (Complex Cross-Language Question Answering) task, we participated in the Chinese-Chinese (C-C) and English-Chinese (E-C) QA (Question Answering) subtasks. In this paper, we describe our QA system, which includes modules for question analysis, document retrieval, information extraction and answer generation. Besides, we used an online MT (Machine Translation) system to deal ...
Resolving Translation Ambiguity using Monolingual Corpora. A Report on Clairvoyance CLEF-2002 Experiments
Choosing the correct target words is a difficult problem for machine translation. In cross-language information retrieval, this problem of choice is mitigated since more than one translation alternative can be retained in the target query. Between choosing just one word as a translation and keeping all the possible translations for each source word, one can apply a range of filtering techniques...
Journal:
- Studies in health technology and informatics
Volume 107, Pt 2
Publication date: 2004