Dgfs-cl Comparing Computational Models of Selectional Preferences – Second-order Co-occurrence vs. Latent Semantic Clusters

Authors

  • Mats Rooth
  • Stefan Riezler
  • Detlef Prescher
  • Glenn Carroll
  • Christian Hying
  • Christian Scheible
Abstract

Selectional preferences (i.e., semantic restrictions on the realisation of predicate complements) are of great interest to research in Computational Linguistics, both from a lexicographic and from an applied (with respect to data sparseness) perspective. This poster presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-order co-occurrence of predicates and complement properties; (ii) an EM-based clustering approach that models the strengths of predicate–noun relationships by latent semantic clusters (Rooth et al., 1999); and (iii) an extension of the latent semantic clusters by incorporating the MDL principle into the EM training, thus explicitly modelling the predicate–noun selectional preferences by WordNet classes (Schulte im Walde et al., 2008). Our work was driven by two main questions. First, concerning the distributional approach, we were interested not only in how well the model describes selectional preferences, but also in which second-order properties were most salient. For example, a typical direct object of the verb drink is usually fluid, might be hot or cold, can be bought, might be bottled, etc. So are adjectives that modify nouns, or verbs that subcategorise nouns, salient second-order properties for describing the selectional preferences of direct objects? Our second interest was in the actual comparison of the models: how does a very simple distributional model compare to much more complex approaches, especially with respect to model (iii), which explicitly incorporates selectional preferences?
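To make the idea of second-order co-occurrence in approach (i) concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a verb's direct-object preference can be summarised by pooling the properties (here, modifying adjectives) of the nouns observed as its objects. The toy pairs, the function name `second_order_profile`, and the relative-frequency weighting are all assumptions made for illustration; the abstract does not specify the actual data or weighting scheme.

```python
from collections import Counter, defaultdict

# Toy data, invented for illustration only (not taken from the paper).
# First-order co-occurrence: (verb, direct-object noun).
verb_object_pairs = [
    ("drink", "coffee"), ("drink", "tea"), ("drink", "water"),
    ("read", "book"), ("read", "letter"),
]
# Noun properties: (noun, adjective that modifies it).
noun_adjective_pairs = [
    ("coffee", "hot"), ("tea", "hot"), ("tea", "cold"), ("water", "cold"),
    ("book", "long"), ("book", "old"), ("letter", "long"),
]


def second_order_profile(verb, verb_objects, noun_properties):
    """Pool the properties of all nouns seen as objects of `verb`.

    Returns a dict mapping each property to its relative frequency.
    Relative frequency is an assumed weighting, chosen for simplicity.
    """
    props_by_noun = defaultdict(Counter)
    for noun, prop in noun_properties:
        props_by_noun[noun][prop] += 1

    profile = Counter()
    for v, noun in verb_objects:
        if v == verb:
            profile.update(props_by_noun[noun])

    total = sum(profile.values())
    return {p: c / total for p, c in profile.items()} if total else {}


drink_profile = second_order_profile("drink", verb_object_pairs, noun_adjective_pairs)
read_profile = second_order_profile("read", verb_object_pairs, noun_adjective_pairs)
print(drink_profile)
print(read_profile)
```

In this toy setting the objects of drink are described by "hot" and "cold", while the objects of read are described by "long" and "old". Swapping the adjective pairs for verbs that subcategorise the nouns would give a different second-order view of the same objects, which is the kind of comparison the abstract raises.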


Similar resources

Comparing Computational Models of Selectional Preferences - Second-order Co-Occurrence vs. Latent Semantic Clusters

This paper presents a comparison of three computational approaches to selectional preferences: (i) an intuitive distributional approach that uses second-order co-occurrence of predicates and complement properties; (ii) an EM-based clustering approach that models the strengths of predicate–noun relationships by latent semantic clusters; and (iii) an extension of the latent semantic clusters by i...


Computational Models for Chinese Selectional Preferences Induction

Selectional preference (SP) is an important kind of semantic knowledge. It can be used in various natural language processing tasks, including metaphor computing, lexicon building, syntactic structure disambiguation, word sense disambiguation, semantic role labeling, anaphora resolution, etc. This paper presents and compares two computational models for Chinese SP induction, a HowNet-based Sele...


Probabilistic Distributional Semantics with Latent Variable Models

We describe a probabilistic framework for acquiring selectional preferences of linguistic predicates and for using the acquired representations to model the effects of context on word meaning. Our framework uses Bayesian latent-variable models inspired by, and extending, the well-known Latent Dirichlet Allocation (LDA) model of topical structure in documents; when applied to predicate–argument ...


The Impact of Selectional Preference Agreement on Semantic Relational Similarity

Relational similarity is essential to analogical reasoning. Automatically determining the degree to which a pair of words belongs to a semantic relation (relational similarity) is greatly improved by considering the selectional preferences of the relation. To determine selectional preferences, we induced semantic classes through a Latent Dirichlet Allocation (LDA) method that operates on depend...


Latent Semantic Clustering of German Verbs with Treebank Data

Treebank data have been utilized as data sources for a wide range of tasks in computational linguistics, including statistical parsing, anaphora resolution, induction of valence lexica, etc. More recently, researchers have experimented with extracting semantic information from syntactically annotated data. Here, treebank data have been used for the purposes of identifying selectional preference...




Publication date: 2009