Eating Your Own Cooking: Automatically Linking Wordnet Synsets of Two Languages

Authors

  • Salil Joshi
  • Arindam Chatterjee
  • Arun Karthikeyan Karra
  • Pushpak Bhattacharyya
Abstract

Linked wordnets are invaluable lexical resources. Wordnet linking involves matching a synset (concept) in one wordnet to the corresponding synset in another wordnet. We have developed an automatic wordnet linking system that works in stages. Starting with a synset in the first language (the source language), our algorithm generates a list of candidate synsets in the second language (the target language) using a bilingual dictionary (BiDict). In subsequent stages, heuristics prune and rank this list, and the top-ranked synset is chosen as the link for the source synset. Our earlier heuristics also used the BiDict to rank the candidate synsets. However, developing a BiDict is cumbersome and requires human labor, and in several cases the sparsity of the BiDict severely handicaps the ranking algorithm. We have therefore devised heuristics that eliminate the need for a BiDict during ranking by using already linked synsets instead. Once a sufficient number of linked synsets is available, these heuristics outperform our BiDict-based heuristics. They are based on observations of the linking techniques applied by lexicographers. Our wordnet linking system can be used for any pair of languages, given either a BiDict or a sufficient number of already linked synsets. The interface of the system is easy to comprehend and use. In this paper, we present this interface along with the heuristics we developed.
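The staged pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the actual heuristics, so everything below is an assumption made for illustration: the function names, the word-overlap score, the neighbour-based score, and the toy data (placeholder words such as `tgt_bank`, synset ids such as `S1` and `T1`) are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def generate_candidates(source_words: Set[str],
                        bidict: Dict[str, Set[str]],
                        target_wordnet: Dict[str, Set[str]]) -> List[str]:
    """Return ids of target synsets containing a BiDict translation of any source word."""
    translations: Set[str] = set()
    for word in source_words:
        translations |= bidict.get(word, set())
    return [sid for sid, words in target_wordnet.items() if words & translations]


def rank_by_bidict_overlap(source_words: Set[str],
                           candidates: List[str],
                           bidict: Dict[str, Set[str]],
                           target_wordnet: Dict[str, Set[str]]) -> List[str]:
    """Hypothetical BiDict-based ranking: more translated words in the synset ranks higher."""
    translations: Set[str] = set()
    for word in source_words:
        translations |= bidict.get(word, set())
    # Sort by descending overlap; ties are broken by synset id for determinism.
    return sorted(candidates,
                  key=lambda sid: (-len(target_wordnet[sid] & translations), sid))


def rank_by_linked_neighbours(source_id: str,
                              candidates: List[str],
                              source_neighbours: Dict[str, Set[str]],
                              target_neighbours: Dict[str, Set[str]],
                              links: Dict[str, str]) -> List[str]:
    """Hypothetical BiDict-free ranking: prefer candidates whose neighbours are the
    already-linked counterparts of the source synset's neighbours."""
    expected = {links[n] for n in source_neighbours.get(source_id, set()) if n in links}
    return sorted(candidates,
                  key=lambda sid: (-len(target_neighbours.get(sid, set()) & expected), sid))


# Toy data (invented placeholders, not real wordnet entries).
source_words = {"bank", "depository"}
bidict = {"bank": {"tgt_bank", "tgt_shore"}, "depository": {"tgt_vault"}}
target_wordnet = {
    "T1": {"tgt_bank", "tgt_vault"},
    "T2": {"tgt_shore", "tgt_coast"},
    "T3": {"tgt_tree"},
}

candidates = generate_candidates(source_words, bidict, target_wordnet)
winner_bidict = rank_by_bidict_overlap(source_words, candidates, bidict, target_wordnet)[0]

# Ranking without the BiDict: S1's neighbour S0 is already linked to T0,
# and T0 is a neighbour of T1 but not of T2.
source_neighbours = {"S1": {"S0"}}
target_neighbours = {"T1": {"T0"}, "T2": {"T9"}}
links = {"S0": "T0"}
winner_linked = rank_by_linked_neighbours(
    "S1", ["T2", "T1"], source_neighbours, target_neighbours, links)[0]
```

In this sketch both rankers select `T1` on the toy data. The second ranker never consults the BiDict, which mirrors the motivation stated in the abstract: once enough synsets are linked, existing links can stand in for a sparse dictionary during ranking.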

Similar resources

Linking ImageNet WordNet Synsets with Wikidata

The linkage of ImageNet WordNet synsets to Wikidata items will leverage deep learning algorithm with access to a rich multilingual knowledge graph. Here I will describe our ongoing efforts in linking the two resources and issues faced in matching the Wikidata and WordNet knowledge graphs. I show an example on how the linkage can be used in a deep learning setting with real-time image classifica...

Mapping and Structural Analysis of Multi-lingual Wordnets

In this paper, we present observations on structural properties of wordnets of three languages: English, Hindi, and Marathi. Hindi and Marathi, spoken widely in India, rank 5th and 14th respectively in the world in terms of the number of people speaking these languages. The observations suggest the existence of the ‘small world’ property in wordnets and also lend credence to the belief that the...

Problems and Procedures to Make Wordnet Data (Retro)Fit for a Multilingual Dictionary

The data compiled through many Wordnet projects can be a rich source of seed information for a multilingual dictionary. However, the original Princeton WordNet was not intended as a dictionary per se, and spawning other languages from it introduces inherent ambiguity that confounds precise inter-lingual linking. This paper discusses a new presentation of existing Wordnet data that displays join...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Improving the Precision of Synset Links Between Cornetto and Princeton WordNet

Knowledge-based multilingual language processing benefits from having access to correctly established relations between semantic lexicons, such as the links between different WordNets. WordNet linking is a process that can be sped up by the use of computational techniques. Manual evaluations of the partly automatically established synonym set (synset) relations between Dutch and English in Corn...

Publication date: 2012