What Types of Translations Hide in Wikipedia?

Authors

  • Jonas Sjöbergh
  • Olof Sjöbergh
  • Kenji Araki
Abstract

We extend an automatically generated bilingual Japanese-Swedish dictionary with new translations automatically discovered from the multilingual online encyclopedia Wikipedia. Over 50,000 translations are generated, most of which are not present in the original dictionary, with very high translation quality. We analyze what types of translations this simple method can generate. The majority of the words are proper nouns, and other types of (usually) uninteresting translations are also generated. Not counting the less interesting words, about 15,000 new translations are still found. Checking against logs of search queries from the old dictionary shows that the new translations would significantly reduce the number of searches with no matching translation.

Similar resources

Applying Catford’s Category Shifts to the Persian Translations of Three English Romantic Poems

This research aimed at evaluating the types and frequency of category shifts in the Persian translations of English poems based on Catford’s model of shifts. To this end, three English romantic poems of A History of English Literature, namely, Blake’s ‘The Chimney Sweeper’, Coleridge’s ‘Kubla Khan’, and Keats’ ‘To Autumn’ along with their Persian t...

RALI Experiments in IR4QA at NTCIR-7

In this report, we examine what information retrieval techniques can help identify documents that contain answers to different types of questions. In particular, we exploit different external resources according to the type of question. For example, Wikipedia will be exploited for identifying personal names and their translation, as well as biography-related keywords. Google search engine is us...

Supporting Multilingual Collaboration for Wikipedia Translations

In Wikipedia, the largest encyclopedia on the Internet, a huge amount of knowledge is shared among users. However, differences in the number of articles among different language versions of Wikipedia represent an important issue. In order to solve the current imbalance of knowledge present in different languages, some users translate existing articles from one language to create new articles i...

Query Translation for Cross-lingual Information Retrieval using Wikipedia

In this paper, the system WikiTranslate is introduced, which performs query translation for cross-lingual information retrieval (CLIR) using only Wikipedia. Queries are mapped to Wikipedia concepts, and the corresponding translations of these concepts in the target language are used to create the final query. WikiTranslate is evaluated by searching with topics in Dutch, French and Spanish i...

Lexical Enrichment of Biomedical Ontologies

This chapter is concerned with lexical enrichment of ontologies, that is, how to enrich a given ontology with lexical information derived from a semantic lexicon such as WordNet or other lexical resources. The authors present an approach towards the integration of both types of resources, in particular for the human anatomy domain as represented by the Foundational Model of Anatomy and ...

Publication date: 2008