Use of Sense Marking for Improving WordNet Coverage
Abstract
WordNet is a crucial resource that supports several Natural Language Processing (NLP) tasks. The IndoWordNet consortium has initiated WordNet development for 18 Indian languages in India, using the expansion approach with the Hindi WordNet developed by IIT Bombay as the source. After 20K synsets had been linked, it was decided that each language group should measure the coverage of its WordNet using the sense marker tool released by IIT Bombay. The sense marking activity mainly helped in validating the WordNet and improving its coverage. This paper presents the various effects that the sense marking activity had on the development of the Konkani WordNet.
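The notion of coverage used in the abstract can be illustrated with a small calculation. The following is only a minimal sketch under stated assumptions, not the IIT Bombay sense marker tool: the function name `wordnet_coverage`, the plain-set representation of WordNet lemmas, and the toy English tokens are all hypothetical, and a real workflow would also involve lemmatization and a human annotator choosing the sense.

```python
from collections import Counter


def wordnet_coverage(corpus_tokens, wordnet_lemmas):
    """Estimate how much of a corpus is covered by a WordNet lemma set.

    Hypothetical helper for illustration only. Returns a tuple of
    (token_coverage, type_coverage, missing_words), where missing_words
    is sorted by descending corpus frequency, then alphabetically.
    """
    counts = Counter(corpus_tokens)
    total_tokens = sum(counts.values())
    if total_tokens == 0:
        # An empty corpus has no meaningful coverage.
        return 0.0, 0.0, []

    covered_types = {word for word in counts if word in wordnet_lemmas}
    covered_tokens = sum(counts[word] for word in covered_types)

    # Words absent from the WordNet are candidates for new synsets.
    missing_words = sorted(
        (word for word in counts if word not in wordnet_lemmas),
        key=lambda word: (-counts[word], word),
    )

    token_coverage = covered_tokens / total_tokens
    type_coverage = len(covered_types) / len(counts)
    return token_coverage, type_coverage, missing_words


if __name__ == "__main__":
    # Toy data (assumed): placeholder tokens, not a real Konkani corpus.
    tokens = ["river", "bank", "river", "boat", "ferry"]
    lemmas = {"river", "bank", "boat"}
    print(wordnet_coverage(tokens, lemmas))
```

In this sketch the list of missing words plays the role of the gaps that sense marking exposes: frequent corpus words with no matching entry are the natural first candidates for new synsets.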
Similar resources
WordNet―Wikipedia―Wiktionary: Construction of a Three-way Alignment
The coverage and quality of conceptual information contained in lexical semantic resources is crucial for many tasks in natural language processing. Automatic alignment of complementary resources is one way of improving this coverage and quality; however, past attempts have always been between pairs of specific resources. In this paper we establish some set-theoretic conventions for describing ...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Introduction to Tools for IndoWordNet and Word Sense Disambiguation
Lexically rich resources form the foundation to all NLP tasks. Maintaining the high quality of resources is thus a high priority issue. In this paper we exhibit the tools developed at IIT Bombay, for the purpose of creation, enhancement and maintenance of the WordNets, as well as the ones used for NLP tasks that use WordNets directly, like Word Sense Disambiguation. The paper presents online an...
Some Challenges of Automated Annotation in A Multilingual Scenario
A key ingredient of today’s NLP scenario is annotation and this paper discusses challenges involved in one of the toughest annotation tasks which is sense marking. A large amount of data needs to be sense marked accurately by human annotators in order to train the machine to understand the spoken languages. The sense marked corpus for various languages facilitate the task of Word Sense Disambig...
Concept Space Synset Manager Tool
The IndoWordNet Consortium consists of member institutions developing WordNets using the expansion approach. WordNets developed using the expansion approach are strongly influenced by the source language and may not reflect the richness of the target language (Walawalikar et al., 2010). The IndoWordNet Community therefore decided to develop concepts which were specific to their respective...