Integrating General-purpose and Corpus-based Verb Classification
Authors
Abstract
A long-standing debate in the computational linguistics community concerns the generality of lexical taxonomies. Many linguists (Nirenburg 1995; Hirst 1995) stress that taxonomies that are not language neutral, at least at the intermediate and high levels, have little hope of success. On the other hand, lexicon builders who have experience of designing taxonomies for real applications claim that in sublanguages there exist very domain-dependent similarity relations. Given our experience and results, we are inclined to take the second position, but we are indeed sensitive to the theoretical motivations of the first. The problem is that the similarity relations suggested by the thematic structures of words in sentences are highly domain dependent, and it is difficult, though perhaps not impossible, to find common invariants across sublanguages when this model of word similarity is adopted. Conceptual, or compositional, models of similarity, in turn, are much more difficult to understand and formalize on a systematic basis, because of the difficulty of defining a commonly agreed-upon set of semantic primitives into which words may be decomposed. It may be possible, however, and highly interesting, to integrate the results of a purely inductive method, such as the conceptual clustering system CIAULA (Basili, Pazienza, and Velardi 1993c, 1996a), with a hand-encoded, domain-general classification such as WordNet. The purpose of one such experiment, which we describe in this paper, is to find some points of contact between psychologically motivated models, such as WordNet, and data-driven models, such as CIAULA.
Similar resources
A Corpus-based Analysis of Collocational Errors in the Iranian EFL Learners' Oral Production
Collocations are one of the areas generally considered problematic for EFL learners. Iranian learners of English like other EFL learners face various problems in producing oral collocations. An analysis of learners' spoken interlanguage both indicates the scope of the problem and the necessity to spend more time and energy by learners on mastering collocations. The present study specifically f...
Functional analysis of Subject and Verb in Theses Abstracts on Applied Linguistics
The purpose of the present study is to analyse abstracts related to Applied Linguistics, and more precisely the discourse functions of grammatical subjects and verbs. The corpus consisted of 50 PhD thesis abstracts written on the subject of Applied Linguistics. All of the abstracts were written from 2010 to 2014. The theses from which the abstracts were extracted are available in the ProQuest d...
An empirical classification of verbs based on Semantic Types: the case of the 'poison' verbs
This article proposes a new approach to verb classification based on Semantic Types selected in corpus-based verb patterns. This work draws on Hanks's Theory of Norms and Exploitations (Hanks 2013) and applies Corpus Pattern Analysis to a subset of verbs from Levin's 'poison' class, including verbs such as hang and stab. These patterns are taken from the Pattern Dictionary of English Ve...
Using automatically learnt verb selectional preferences for classification of biomedical terms
In this paper, we present an approach to term classification based on verb selectional patterns (VSPs), where such a pattern is defined as a set of semantic classes that could be used in combination with a given domain-specific verb. VSPs have been automatically learnt based on the information found in a corpus and an ontology in the biomedical domain. Prior to the learning phase, the corpus is...
Journal title:
- Computational Linguistics
Volume 22
Publication date: 1996