Semantic Typology and Parallel Corpora: Something about Indefinite Pronouns

Authors

  • Barend Beekhuizen
  • Julia Watson
  • Suzanne Stevenson
Abstract

Patterns of crosslinguistic variation in the expression of word meaning are informative about semantic organization, but most methods for studying them are labor-intensive and obscure the gradient nature of concepts. We propose an automatic method for extracting crosslinguistic co-categorization patterns from parallel texts, and explore the properties of the data as a potential source for automatically creating semantic representations for cognitive modeling. We focus on indefinite pronouns, comparing our findings against a study based on secondary sources (Haspelmath 1997). We show that using automatic methods on parallel texts contributes to more cognitively plausible semantic representations for a domain.
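The idea of a co-categorization pattern can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that word alignment over the parallel text has already produced, for each usage instance of an indefinite pronoun, the form used in each language. The toy data, the instance labels `s1` to `s3`, and the function name `co_categorization` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each usage instance (one aligned sentence position)
# maps a language code to the indefinite pronoun form used in that language.
instances = {
    "s1": {"en": "somebody", "nl": "iemand", "de": "jemand"},
    "s2": {"en": "anybody", "nl": "iemand", "de": "jemand"},
    "s3": {"en": "nobody", "nl": "niemand", "de": "niemand"},
}


def co_categorization(a, b):
    """Fraction of shared languages that express both instances with the same form."""
    shared = [lang for lang in a if lang in b]
    if not shared:
        return 0.0
    same = sum(1 for lang in shared if a[lang] == b[lang])
    return same / len(shared)


# Pairwise similarity between usage instances, keyed by sorted instance labels.
similarity = {
    (i, j): co_categorization(instances[i], instances[j])
    for i, j in combinations(sorted(instances), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 2))
```

Because the score is a proportion over languages rather than a yes/no judgment, it keeps similarity gradient. A matrix of such scores could in principle be passed to a dimensionality-reduction method to obtain a spatial representation of the domain.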

Similar resources

Exemplar semantics through parallel corpora: Something about indefinite pronouns

We can determine ‘similarity’ of meaning typologically. If two particular meanings are often expressed by the same surface form (across a random sample of languages), then we can assume that the two meanings are ‘similar’ to the human mind. [...] From ‘similarities’ it is a short step to maps of grammar/meaning space. We arrange different meanings on a map so that ‘similar’ meanings are clos...

Indefinite pronouns: A review of Haspelmath (1997)

Certain properties of indefinite pronouns have received considerable attention in the literature. For instance, negative polarity indefinites such as English any constitute the focus of the bulk of the literature on negative polarity items, the fact notwithstanding that Klima in his seminal 1964 paper did not fail to show that members of other syntactic and semantic categories may be restricted...

Multilingual corpora with coreferential annotation of person entities

This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish. They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, demonstrative, relative and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). Some statistics have been computed, showing distr...

The DAD Parallel Corpora and their Uses

This paper deals with the uses of the annotations of third person singular neuter pronouns in the DAD parallel and comparable corpora of Danish and Italian texts and spoken data. The annotations contain information about the functions of these pronouns and their uses as abstract anaphora. Abstract anaphora have constructions such as verbal phrases, clauses and discourse segments as antecedents ...

Using Optimal Classification for multidimensional scaling analysis of linguistic data

Multidimensional scaling (MDS) is a technique for visualizing the relationships among data that are similar to each other on very many dimensions. For example, meanings of words such as indefinite pronouns are similar to each other by virtue of being expressed by the same indefinite pronoun in one language or another (Haspelmath 1997). If one compares indefinite pronouns of a large number of la...

Publication date: 2017