Bringing together over- and under-represented languages: Linking WordNet to the SIL Semantic Domains

Authors

  • Muhammad Zulhelmy Bin Mohd Rosman
  • Frantisek Kratochvil
  • Francis Bond
Abstract

We have created an open-source mapping between the SIL’s semantic domains (used for rapid lexicon building and organization for under-resourced languages) and WordNet, the standard resource for lexical semantics in natural language processing. We show that the resources complement each other, and suggest ways in which the mapping can be improved even further. The semantic domains give more general domain and associative links, which wordnet still has few of, while wordnet gives explicit semantic relations between senses, which the domains lack.


Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...


Chinese WordNet Domains: Bootstrapping Chinese WordNet with Semantic Domain Labels

We bootstrapped Chinese WordNet with semantic domain labels of WordNet Domains for constructing a language resource called Chinese WordNet Domains. The bootstrapping methods work from three aspects: 1) Princeton WordNet alignment, 2) lexical semantic relations and 3) domain taxonomy mapping. Experimental results of our proposed bootstrapping based domain predication achieve satisfying effects. ...


Mapping and Structural Analysis of Multi-lingual Wordnets

In this paper, we present observations on structural properties of wordnets of three languages: English, Hindi, and Marathi. Hindi and Marathi, spoken widely in India, rank 5th and 14th respectively in the world in terms of the number of people speaking these languages. The observations suggest the existence of the ‘small world’ property in wordnets and also lend credence to the belief that the...


Linking CoreNet to WordNet - Some Aspects and Interim Consideration

CoreNet, which is built on 2,937 semantic categories, is a multilingual lexico-semantic network aiming at bridging multiple languages/parts-of-speech for a variety of NLP applications. To foster its more widespread use, we have attempted to link semantic categories of CoreNet to Princeton WordNet. To ameliorate translation problems between CoreNet (mostly written in Korean) and English WordNet ...


Automatic Identification and Disambiguation of Concepts and Named Entities in the Multilingual Wikipedia

In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two languages, with both word senses (i.e. concepts) and named entities. We use Babelfy 1.0, a state-of-the-art multilingual Word Sense Disambiguation and Entity Linking system. As its reference inventory, Babelfy draws upon BabelNet 3.0, a very large multilingual encyclopedic dictionary and semantic network...



Publication date: 2014