Automatic Acquisition of Domain Specific Lexicons
Abstract
In this paper we present the results of three years of experiments on the automatic acquisition of domain-specific terminology from corpora. We analyze the potential and limitations of the Term Categorization approach to lexical acquisition, and we propose a novel methodology for the task, which consists of applying Latent Semantic Kernels to estimate term similarity. We found that domain-specific monosemous terms behave similarly to domain-specific lexical items, so we used them to train and evaluate our Term Categorization system. Results show that the proposed technique is effective, achieving an accuracy of about 43%. We also report an error analysis showing that most of the misclassification errors are related to the fuzzy nature of domain distinctions. In particular, we identified a set of “families” in the WordNet Domains categories that make the classification task difficult. Categorizing monosemous terms according to domain labels allows us to automatically assign domain labels to a subset of the WordNet synsets, which in turn makes it possible to run a bootstrap procedure that assigns a domain label to every synset in WordNet.
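The idea of estimating term similarity in a latent semantic space can be illustrated with a minimal sketch. The abstract does not give the kernel's exact form or the classifier used, so everything below is an assumption for illustration: a toy term-by-document count matrix, a truncated SVD as the latent projection, cosine similarity as the kernel, and a nearest-seed rule standing in for the Term Categorization step. The function names, the seed terms (playing the role of monosemous terms with known domain labels), and the data are all hypothetical.

```python
import numpy as np

def latent_term_vectors(term_doc, k):
    """Project each term (row) into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    vecs = u[:, :k] * s[:k]                      # term coordinates in latent space
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # avoid division by zero for empty rows
    return vecs / norms                          # unit length, so dot product = cosine

def latent_kernel(term_doc, k):
    """Term-by-term similarity matrix computed in the latent space."""
    vecs = latent_term_vectors(term_doc, k)
    return vecs @ vecs.T

def categorize(kernel, seeds, query):
    """Give the query term the domain label of its most similar seed term.

    Nearest-seed assignment is an assumption of this sketch, not the paper's classifier.
    """
    best = max(seeds, key=lambda i: kernel[query, i])
    return seeds[best]

# Toy corpus: rows are terms, columns are documents (two sport, two medicine).
terms = ["goalkeeper", "striker", "penalty", "antibiotic", "vaccine", "diagnosis"]
term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

kernel = latent_kernel(term_doc, k=2)

# Hypothetical seed terms with known domain labels (index -> label).
seeds = {terms.index("goalkeeper"): "sport", terms.index("antibiotic"): "medicine"}

for name in ("penalty", "diagnosis"):
    print(name, "->", categorize(kernel, seeds, terms.index(name)))
```

In this toy setting, terms that occur in the same kind of documents end up close together in the latent space even when they never co-occur with a seed term in every document, which is the property that makes a latent kernel attractive for sparse terminology.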
Similar resources
Semi-Automatic Acquisition of Domain-Specific Translation Lexicons
We investigate the utility of an algorithm for translation lexicon acquisition (SABLE), used previously on a very large corpus to acquire general translation lexicons, when that algorithm is applied to a much smaller corpus to produce candidates for domain-specific translation lexicons. 1 Introduction. Reliable translation lexicons are useful in many applications, such as cross-langua...
Semi-automatic Acquisition of Domain-specific Translation Lexicons
We investigate the utility of an algorithm for translation lexicon acquisition (SABLE), used previously on a very large corpus to acquire general translation lexicons, when that algorithm is applied to a much smaller corpus to produce candidates for domain-specific translation lexicons.
Automatic Acquisition of Meaning Elements for the Creation of Semantic Lexicons
This paper presents, in a unified way, two new trends in natural language processing, that is a new kind of lexicons that are cornerstones of a lot of current natural language applications which tackle the problem of meaning, and different corpus-based lexical knowledge acquisition studies that have emerged with the big amounts of electronic texts available on the nets. More precisely, this pap...
Acquisition sur corpus d'informations lexicales fondées sur la sémantique différentielle
Semantic lexicons are an essential resource to let many natural language processing applications (automatic summarization, information retrieval, automatic translation, etc.) penetrate the meaning of a text. The relevance of the information gathered by those lexicons raises a problematic question: the meaning of a word like soap, for example, varies considerably whether it ...
Publication date: 2005