A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes
Abstract
We present an unsupervised method for inducing verb classes from verb uses in giga-word corpora. Our method consists of two clustering steps: verb-specific semantic frames are first induced by clustering verb uses in a corpus, and verb classes are then induced by clustering these frames. This step-wise approach not only generates verb classes from a massive number of verb uses in a scalable manner, but also handles verb polysemy, which most previous studies on verb clustering bypass. In our experiments, we acquire semantic frames and verb classes from two giga-word corpora, the larger comprising 20 billion words. The effectiveness of our approach is verified through quantitative evaluations based on polysemy-aware gold-standard data.
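The two-step idea can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or features are used, so everything here is an assumption for illustration only: a greedy single-pass clustering with cosine similarity stands in for both steps, and the toy verb uses, the feature names such as `subj:person`, and the 0.6 thresholds are hypothetical.

```python
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def greedy_cluster(vectors, threshold):
    """Single pass: join the most similar cluster, or open a new one.
    This is a stand-in algorithm, not the method from the paper."""
    centroids, members = [], []
    for i, vec in enumerate(vectors):
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best].update(vec)  # Counter.update adds counts
            members[best].append(i)
        else:
            centroids.append(Counter(vec))  # copy, so inputs stay unchanged
            members.append([i])
    return centroids, members

def induce_verb_classes(verb_uses, frame_threshold=0.6, class_threshold=0.6):
    """Step 1: cluster each verb's uses into verb-specific frames.
    Step 2: cluster the frames of all verbs into verb classes."""
    frames = []  # (verb, frame index, frame centroid)
    for verb, uses in verb_uses.items():
        centroids, _ = greedy_cluster([Counter(u) for u in uses], frame_threshold)
        frames.extend((verb, i, c) for i, c in enumerate(centroids))
    _, members = greedy_cluster([c for _, _, c in frames], class_threshold)
    return [[f"{frames[i][0]}#{frames[i][1]}" for i in group] for group in members]

# Hypothetical toy data: each verb use is a list of argument-slot features.
TOY_USES = {
    "run": [
        ["subj:person", "to:place"],
        ["subj:person", "to:place"],
        ["subj:person", "obj:company"],
    ],
    "walk": [["subj:person", "to:place"], ["subj:person", "to:place"]],
    "manage": [["subj:person", "obj:company"], ["subj:person", "obj:company"]],
}

if __name__ == "__main__":
    for verb_class in induce_verb_classes(TOY_USES):
        print(verb_class)
```

In this toy run, "run" is split into two frames in the first step, so it can land in two different classes in the second step (one with "walk", one with "manage"). That is the sense in which a step-wise approach can be polysemy-aware: classes are built over frames rather than over whole verbs.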
Related resources
Clustering Polysemic Subcategorization Frame Distributions Semantically
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation schem...
Semi-automatic Induction of Systematic Polysemy from WordNet
This paper describes a semi-automatic method of inducing underspecified semantic classes from WordNet verbs and nouns. An underspecified semantic class is an abstract semantic class which encodes systematic polysemy, a set of word senses that are related in systematic and predictable ways. We show the usefulness of the induced classes in the semantic interpretations and contextual inferences o...
Verb polysemy and frequency effects in thematic fit modeling
While several data sets for evaluating thematic fit of verb-role-filler triples exist, they do not control for verb polysemy. Thus, it is unclear how verb polysemy affects human ratings of thematic fit and how best to model that. We present a new dataset of human ratings on high vs. low-polysemy verbs matched for verb frequency, together with high vs. low-frequency and well-fitting vs. poorly-f...
A principled Cognitive Linguistics account of English phrasal verbs with up and out
Many attempts have been made to discover some systematicity in the semantics of phrasal verbs. However, most research has investigated the semantics of particles exclusively; no study has examined how the multiple meanings of the verb also contribute to the meanings of phrasal verbs. The current corpus-based (COCA) study advances the research on phrasal verbs by examining the interaction of the...
Synergetic Properties of Chinese Verb Valency
This paper analyses the 500 most frequent verbs in contemporary Chinese and investigates their synergetic properties. The results show that the rank-frequency distributions of both valency and polysemy abide by a power-law distribution and that valency and polysemy of these verbs abide by the Good distribution and the positive negative binomial distribution respectively. Statistical analysis in...