A Mixture Model of Demographic Lexical Variation

Authors

  • Brendan O’Connor
  • Jacob Eisenstein
  • Eric P. Xing
  • Noah A. Smith
  • Zenglin Xu
  • Irwin King
  • Shenghuo Zhu
  • Yuan Qi
  • Rong Yan
  • John Yen
Abstract

We propose a Bayesian generative model of how demographic social factors influence lexical choice. We apply the method to a corpus of geo-tagged Twitter messages originating from mobile phones, cross-referenced against U.S. Census demographic data. Our method discovers communities jointly defined by linguistic and demographic properties.

Similar resources

Discovering Demographic Language Variation

We propose a Bayesian generative model of how demographic social factors influence lexical choice. We apply the method to a corpus of geo-tagged Twitter messages originating from mobile phones, cross-referenced against U.S. Census demographic data. Our method discovers communities jointly defined by linguistic and demographic properties.

Between versus Within-Language Differences in Linguistic Categorization

Cross-linguistic research has shown that boundaries for lexical categories differ from language to language. The aim of this study is to explore these differences between languages in relation to the categorization differences within a language. Monolingual Dutch-speaking (N=400) and French-speaking (N=300) Belgian adults provided lexical category judgments for three lexical categories that are roughly e...

Thermal Degradation Kinetic Study of a Fuel-rich Energetic Mixture Containing Epoxy Binder

In this work, the thermal degradation behavior of a fuel-rich energetic mixture containing epoxy binder was studied by thermogravimetric analysis and differential scanning calorimetry under a dynamic nitrogen atmosphere at different heating rates. Variation of the thermal degradation activation energy of the mixture was evaluated by differential and integral isoconversional methods via ...

Input-induced Variation in EFL Learners’ Oral Production in Terms of Complexity, Accuracy, and Fluency

Researchers have extensively studied phenomena that affect a second language learner’s oral production, while there is scant evidence about input-related factors. Accordingly, the present study sought to investigate how variation in learners’ oral production is caused by the input they receive from different course materials. To this end, the study included a micro-evaluation study of three course materia...

Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification

Text-dependent short duration speaker verification involves two challenges. The primary challenge of interest is the verification of the speaker’s identity, and often a secondary challenge of interest is the verification of the lexical content of the pass-phrase. In this paper, we propose the use of two systems to handle these two tasks in parallel with one subsystem modelling speaker identity ...

Publication date: 2011