Word Sense Disambiguation, Lexical Semantics, and NLP Applications
Abstract
Ide and Wilks argue that a fine-grained division of senses may not be an appropriate goal for a computational WSD task (Ide and Wilks, 2006). They propose that NLP needs correspond roughly to homograph-level distinctions, although they acknowledge some evidence of broad level distinctions within a homograph. They further argue that machine translation requires finer-grained distinctions than information retrieval. These arguments are problematic for the following reasons:
Related resources
Towards a Seamless Integration of Word Senses into Downstream NLP Applications
Lexical ambiguity can impede NLP systems from accurate understanding of semantics. Despite its potential benefits, the integration of sense-level information into NLP systems has remained understudied. By incorporating a novel disambiguation algorithm into a state-of-the-art classification model, we create a pipeline to integrate sense-level information into downstream NLP applications. We show...
Unsupervised and Minimally Supervised Learning of Lexical Semantics Proceedings of the Workshop
Supervised word sense disambiguation requires training corpora that have been tagged with word senses, and these word senses typically come from a pre-existing sense inventory. Space limitations imposed by dictionary publishers have biased the field towards lists of discrete senses for an individual lexeme. This approach does not capture information about relatedness of individual senses. How i...
Semantic Clustering of Pivot Paraphrases
Paraphrases extracted from parallel corpora by the pivot method (Bannard and Callison-Burch, 2005) constitute a valuable resource for multilingual NLP applications. In this study, we analyse the semantics of unigram pivot paraphrases and use a graph-based sense induction approach to unveil hidden sense distinctions in the paraphrase sets. The comparison of the acquired senses to gold data from ...
Forming an Integrated Lexical Resource for Word Sense Disambiguation
This paper reports a full-scale linkage of noun senses between two existing lexical resources, namely WordNet and Roget's Thesaurus, to form an Integrated Lexical Resource (ILR) for use in natural language processing (NLP). The linkage was founded on a structurally-based sense-mapping algorithm. About 18,000 nouns with over 30,000 senses were mapped. Although exhaustive verification is impracti...
Word Senses: The Stepping Stones in Semantic-Based Natural Language Processing
Most of the successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, the usual motivation being the high computational cost of dealing with semantics over large volumes of data. With recent advances in corpus linguistics and statistical-based methods in NLP, revealing useful semantic features of l...