Bringing together over- and under-represented languages: Linking WordNet to the SIL Semantic Domains
Abstract
We have created an open-source mapping between SIL’s semantic domains (used for rapid lexicon building and organization in under-resourced languages) and WordNet, the standard resource for lexical semantics in natural language processing. We show that the two resources complement each other and suggest ways to improve the mapping further. The semantic domains provide more general domain and associative links, of which wordnet still has few, while wordnet provides explicit semantic relations between senses, which the domains lack.
Similar resources
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is used chiefly for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Chinese WordNet Domains: Bootstrapping Chinese WordNet with Semantic Domain Labels
We bootstrapped Chinese WordNet with the semantic domain labels of WordNet Domains to construct a language resource called Chinese WordNet Domains. The bootstrapping methods work along three lines: 1) Princeton WordNet alignment, 2) lexical semantic relations, and 3) domain taxonomy mapping. Experimental results show that our proposed bootstrapping-based domain prediction performs satisfactorily. ...
Mapping and Structural Analysis of Multi-lingual Wordnets
In this paper, we present observations on structural properties of wordnets of three languages: English, Hindi, and Marathi. Hindi and Marathi, spoken widely in India, rank 5th and 14th respectively in the world in terms of the number of people speaking these languages. The observations suggest the existence of the ‘small world’ property in wordnets and also lend credence to the belief that the...
Linking CoreNet to WordNet - Some Aspects and Interim Consideration
CoreNet, which is built on 2,937 semantic categories, is a multilingual lexico-semantic network aiming at bridging multiple languages/parts-of-speech for a variety of NLP applications. To foster its more widespread use, we have attempted to link semantic categories of CoreNet to Princeton WordNet. To ameliorate translation problems between CoreNet (mostly written in Korean) and English WordNet ...
Automatic Identification and Disambiguation of Concepts and Named Entities in the Multilingual Wikipedia
In this paper we present an automatic multilingual annotation of the Wikipedia dumps in two languages, with both word senses (i.e. concepts) and named entities. We use Babelfy 1.0, a state-of-the-art multilingual Word Sense Disambiguation and Entity Linking system. As its reference inventory, Babelfy draws upon BabelNet 3.0, a very large multilingual encyclopedic dictionary and semantic network...