Hawkes Binomial Topic Model with Applications to Coupled Conflict-Twitter Data

Authors

  • George Mohler
  • Erin McGrath
  • Cody Buntain
  • Gary LaFree
Abstract

We consider the problem of modeling and clustering heterogeneous event data arising from coupled conflict event and social media data sets. In this setting, conflict events trigger responses on social media, and at the same time signals of grievance detected in social media may serve as leading indicators for subsequent conflict events. For this purpose we introduce the Hawkes Binomial Topic Model (HBTM), where the marks (Tweets and conflict event descriptions) are represented as bags of words following a Binomial distribution. When viewed as a branching process, the daughter event bag of words is generated by randomly turning on/off parent words through independent Bernoulli random variables. We then use Expectation-Maximization to estimate the model parameters and branching structure of the process. The inferred branching structure is then used for topic cascade detection, short-term forecasting, and investigating the causal dependence of grievance on social media and conflict events in recent elections in Nigeria and Kenya.

1. Background and Motivation

Twitter and other social media platforms have emerged as important tools for the public to communicate responses to crises and terrorist attacks [5] and, more generally, to communicate collectively, exchanging grievances that may catalyze mobilization [3, 24]. Research has focused on understanding public sentiment around these types of events and determining the central actors in the social network that are key to shaping public response [5], along with measuring short-term changes in the intensity of conflict using social media [29] or considering effects from regional instability [4]. At a more macro spatial-temporal scale, recent research has focused on modeling the endogenous and exogenous processes that generate thousands of terrorist and conflict events at the level of countries and years or decades.
For this purpose point processes are used to model contagion effects in the risk of terrorist activity [21] and both contagion and exogenous rate fluctuations in conflict [17, 28, 27]. Because of the relative infrequency of conflict and terrorist events, having auxiliary data that can provide a signal for the risk of future events is highly desirable. While conflict events have been shown to influence overall regional instability [4], point process models of conflict to date have focused on univariate data [21, 17, 28, 27]. In this paper we use ...
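
The branching mechanism described in the abstract, where a daughter event's bag of words is obtained by switching parent words on or off with independent Bernoulli random variables, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `sample_daughter` and the parameters `p_keep` (probability that a parent word stays on) and `p_on` (probability that a word absent from the parent turns on) are hypothetical names and values chosen for illustration, and a bag of words is simplified here to a set of word indices.

```python
import random


def sample_daughter(parent, vocab_size, p_keep=0.9, p_on=0.01, rng=None):
    """Sample a daughter bag of words (a set of word indices) from a parent.

    Assumed mechanism (illustrative, not taken from the paper): each word
    in the vocabulary gets its own independent Bernoulli draw. A word that
    is present in the parent stays on with probability p_keep; a word that
    is absent from the parent turns on with probability p_on.
    """
    rng = rng or random.Random()
    daughter = set()
    for word in range(vocab_size):
        p = p_keep if word in parent else p_on
        # Independent Bernoulli(p) draw for this word.
        if rng.random() < p:
            daughter.add(word)
    return daughter


# Example: a parent event using words 0, 3 and 7 from a 10-word vocabulary.
parent_words = {0, 3, 7}
daughter_words = sample_daughter(parent_words, vocab_size=10, rng=random.Random(0))
```

Under these assumptions, setting `p_keep=1.0` and `p_on=0.0` reproduces the parent exactly, while smaller `p_keep` or larger `p_on` lets the text of a cascade drift away from its parent event.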


Similar resources

Estimation of Count Data using Bivariate Negative Binomial Regression Models

The negative binomial regression model (NBR) is a popular approach for modeling overdispersed count data with covariates. Several parameterizations of NBR have been proposed, and the two well-known models, negative binomial-1 regression model (NBR-1) and negative binomial-2 regression model (NBR-2), have been applied. Another parameterization of NBR is negative binomial-P regression mode...


Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter

Classification of temporal textual data sequences is a common task in various domains such as social media and the Web. In this paper we propose to use Hawkes Processes for classifying sequences of temporal textual data, which exploit both temporal and textual information. Our experiments on rumour stance classification on four Twitter datasets show the importance of using the temporal informat...


Catching Fire via "Likes": Inferring Topic Preferences of Trump Followers on Twitter

In this paper, we propose a framework to infer the topic preferences of Donald Trump’s followers on Twitter. We first use latent Dirichlet allocation (LDA) to derive the weighted mixture of topics for each Trump tweet. Then we use negative binomial regression to model the “likes,” with the weights of each topic serving as explanatory variables. Our study shows that attacking Democrats such as P...


Beta-Binomial and Ordinal Joint Model with Random Effects for Analyzing Mixed Longitudinal Responses

The analysis of discrete mixed responses is an important statistical issue in various sciences. Ordinal and overdispersed binomial variables are discrete. Overdispersed binomial data are a sum of correlated Bernoulli experiments with equal success probabilities. In this paper, a joint model with random effects is proposed for analyzing mixed overdispersed binomial and ordinal longitudinal respo...


Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling

Twitter data is extremely noisy – each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network i...




Publication date: 2017