Discovering Factions in the Computational Linguistics Community

Authors

  • Yanchuan Sim
  • Noah A. Smith
  • David A. Smith
Abstract

We present a joint probabilistic model of who cites whom in computational linguistics, and also of the words they use to do the citing. The model reveals latent factions, or groups of individuals whom we expect to collaborate more closely within their faction, cite within the faction using language distinct from citation outside the faction, and be largely understandable through the language used when cited from without. We conduct an exploratory data analysis on the ACL Anthology. We extend the model to reveal changes in some authors’ faction memberships over time.

Similar resources

Predicting Responses and Discovering Social Factors in Scientific Literature

We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our first two models investigate temporal and spatial aspects of a scientific community’s interests. A third model which...

Data Mining Meets Collocations Discovery

In this paper we discuss the problem of discovering interesting word sequences in the light of two traditions: sequential pattern mining (from data mining) and collocations discovery (from computational linguistics). Smadja (1993) defines a collocation as “a recurrent combination of words that cooccur more often than chance and that correspond to arbitrary word usages.” The notion of arbitrarin...

Attitudes in Iranian vs. Western Media Coverage of the Iranian Nuclear Issue

Employing the appraisal framework in discovering the way ideology is crystalized through discourse, the present study attempts to investigate how journalistic ideologies and political positions are manifested through attitudinal terms. Referring to White’s (2012) distinction of attitude types, inscribed vs. invoked, based on Martin and White’s (2005) appraisal theory, journalistic ideology toge...

Discovering Parallel Text from the World Wide Web

A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and multilingual text mining. Constructing a parallel corpus requires effective alignment of parallel documents. In this paper, we develop a parallel page identification system for identifying and aligning parallel documents ...

Arabic Rhetorical Relations Extraction for Answering "Why" and "How to" Questions

In the current study we aim to exploit the discourse structure of Arabic text to automatically find answers to non-factoid questions ("Why" and "How to"). Our method is based on Rhetorical Structure Theory (RST), which many studies have shown to be a very effective approach for computational linguistics applications such as text generation, text summarization and machine translation. For...

Publication date: 2012