Hawkes Binomial Topic Model with Applications to Coupled Conflict-Twitter Data
Abstract
We consider the problem of modeling and clustering heterogeneous event data arising from coupled conflict event and social media data sets. In this setting, conflict events trigger responses on social media and, at the same time, signals of grievance detected in social media may serve as leading indicators of subsequent conflict events. For this purpose we introduce the Hawkes Binomial Topic Model (HBTM), in which the marks (Tweets and conflict event descriptions) are represented as bags of words following a Binomial distribution. When the process is viewed as a branching process, the bag of words of a daughter event is generated by randomly turning parent words on or off through independent Bernoulli random variables. We then use Expectation-Maximization to estimate the model parameters and the branching structure of the process. The inferred branching structure is then used for topic cascade detection, short-term forecasting, and investigating the causal dependence between grievance on social media and conflict events in recent elections in Nigeria and Kenya.

1. Background and Motivation. Twitter and other social media platforms have emerged as important tools for the public to communicate responses to crises and terrorist attacks [5] and, more generally, to communicate collectively, exchanging grievances that may catalyze mobilization [3, 24]. Research has focused on understanding public sentiment around these types of events and determining the central actors in the social network who are key to shaping public response [5], along with measuring short-term changes in the intensity of conflict using social media [29] or considering effects from regional instability [4]. At a more macro spatio-temporal scale, recent research has focused on modeling the endogenous and exogenous processes that generate thousands of terrorist and conflict events at the level of countries and years or decades.
For this purpose, point processes are used to model contagion effects in the risk of terrorist activity [21] and both contagion and exogenous rate fluctuations in conflict [17, 28, 27]. Because of the relative infrequency of conflict and terrorist events, having auxiliary data that can provide a signal for the risk of future events is highly desirable. While conflict events have been shown to influence overall regional instability [4], point process models of conflict to date have focused on univariate data [21, 17, 28, 27]. In this paper we use ...

imsart-aoas ver. 2014/10/16 file: main.tex date: August 6, 2017
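The branching mechanism described in the abstract, where a daughter event's bag of words is obtained by independently switching parent words on or off, can be illustrated with a minimal sketch. The function name `sample_daughter`, the single shared keep probability `keep_prob`, and the toy vocabulary are assumptions made for illustration only; the excerpt does not specify how the Bernoulli probabilities are parameterized or whether words absent from the parent can appear in the daughter.

```python
import numpy as np


def sample_daughter(parent_words, keep_prob, rng):
    """Sample a daughter bag of words from a parent bag of words.

    parent_words: boolean vector over the vocabulary (True = word present).
    keep_prob: assumed probability that a parent word stays "on".
    rng: a numpy random Generator.
    """
    parent_words = np.asarray(parent_words, dtype=bool)
    # One independent Bernoulli(keep_prob) draw per vocabulary entry.
    keep = rng.random(parent_words.shape) < keep_prob
    # In this sketch a word can appear in the daughter only if it is in the parent.
    return parent_words & keep


# Hypothetical vocabulary and parent event, for illustration only.
vocab = ["election", "protest", "police", "vote", "violence"]
parent = np.array([True, True, False, True, False])

rng = np.random.default_rng(0)
daughter = sample_daughter(parent, keep_prob=0.7, rng=rng)
daughter_tokens = [w for w, on in zip(vocab, daughter) if on]
```

Under this assumption the number of parent words retained by a daughter is Binomial(n, `keep_prob`), where n is the number of words in the parent event, which is one way to read the "Binomial" in the model's name.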
Similar resources
Estimation of Count Data using Bivariate Negative Binomial Regression Models
Negative binomial regression model (NBR) is a popular approach for modeling overdispersed count data with covariates. Several parameterizations have been performed for NBR, and the two well-known models, negative binomial-1 regression model (NBR-1) and negative binomial-2 regression model (NBR-2), have been applied. Another parameterization of NBR is negative binomial-P regression mode...
Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter
Classification of temporal textual data sequences is a common task in various domains such as social media and the Web. In this paper we propose to use Hawkes Processes for classifying sequences of temporal textual data, which exploit both temporal and textual information. Our experiments on rumour stance classification on four Twitter datasets show the importance of using the temporal informat...
Catching Fire via "Likes": Inferring Topic Preferences of Trump Followers on Twitter
In this paper, we propose a framework to infer the topic preferences of Donald Trump’s followers on Twitter. We first use latent Dirichlet allocation (LDA) to derive the weighted mixture of topics for each Trump tweet. Then we use negative binomial regression to model the “likes,” with the weights of each topic serving as explanatory variables. Our study shows that attacking Democrats such as P...
Beta-Binomial and Ordinal Joint Model with Random Effects for Analyzing Mixed Longitudinal Responses
The analysis of discrete mixed responses is an important statistical issue in various sciences. Ordinal and overdispersed binomial variables are discrete. Overdispersed binomial data are a sum of correlated Bernoulli experiments with equal success probabilities. In this paper, a joint model with random effects is proposed for analyzing mixed overdispersed binomial and ordinal longitudinal respo...
Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling
Twitter data is extremely noisy – each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network i...