EuroWordNet as a Resource for Cross-language Information Retrieval
Abstract
One of the aims of EuroWordNet (EWN) was to provide a resource for Cross-Language Information Retrieval (CLIR). In this paper we present experiments that test the usefulness of EWN for this purpose via a formal evaluation using the Spanish queries from the TREC6 CLIR test set. All CLIR systems using bilingual dictionaries must find a way of dealing with multiple translations, and we employ a word sense disambiguation algorithm for this purpose. Retrieval performance with the disambiguation algorithm was 90% of that recorded using queries which had been disambiguated manually.
Similar resources
Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-language Information Retrieval
One of the aims of EuroWordNet (EWN) was to provide a resource for Cross-Language Information Retrieval (CLIR). In this paper we present experiments which test the usefulness of EWN for this purpose via a formal evaluation using the Spanish queries from the TREC6 CLIR test set. All CLIR systems using bilingual dictionaries must find a way of dealing with multiple translations and we employ a WS...
EuroWordNet: a multilingual database for information retrieval
The aim of the EuroWordNet project is the development of a database with wordnets for English, Spanish, Dutch and Italian, similar to the Princeton WordNet 1.5, which contains basic semantic relations between words in English. The Dutch, Italian and Spanish wordnets will be linked to WordNet 1.5 using equivalence relations. The resulting multilingual database can directly be used in (multi-li...
Using Eurowordnet in a Concept-Based Approach to Cross-Language Text Retrieval
We present an approach to cross-language text retrieval based on the EuroWordNet (EWN) multilingual semantic database. EuroWordNet is a multilingual, WordNet-like database with basic semantic relations between words for several European languages (English, Dutch, Spanish, Italian, German, French, Czech, and Estonian). In addition to the relations in WordNet 1.5, EWN includes domain labels, cro...
An Efficient and Flexible Format for Linguistic and Semantic Annotation
The paper describes an XML annotation format and tool developed within the MUCHMORE project. The annotation scheme was designed specifically for the purposes of Cross-Lingual Information Retrieval in the medical domain so as to allow both efficient and flexible access to layers of information. We use a parallel English-German corpus of medical abstracts and annotate it with linguistic informati...
Evaluating Wordnets in Cross-language Information Retrieval: the ITEM Search Engine
This paper presents the ITEM multilingual search engine. This search engine performs full lexical processing (morphological analysis, tagging and Word Sense Disambiguation) on documents and queries in order to provide language-neutral indexes for querying and retrieval. The indexing terms are the EuroWordNet/ITEM InterLingual Index records that link wordnets in 10 languages of the European Comm...