Early prediction of the duration of protests using probabilistic Latent Dirichlet Allocation and Decision Trees

Authors

  • Satyakama Paul
  • Madhur Hasija
  • Tshilidzi Marwala
Abstract

Protests and agitations are an integral part of every democratic civil society. In recent years, South Africa has seen a large increase in protests. The objective of this paper is to provide an early prediction of the duration of a protest from its free-flowing English text description. Free-flowing descriptions help capture a protest's various nuances, such as multiple causes and courses of action. We use a combination of unsupervised learning (topic modeling) and supervised learning (decision trees) to predict the duration of the protests. Our results show a high degree of accuracy (close to 90%) in early prediction of protest duration. We expect the work to help police and other security services plan and manage their resources to better handle future protests.
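
The two-stage approach described in the abstract (topic modeling to turn free text into features, then a decision tree on those features) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the use of scikit-learn, the number of topics, the tree depth, the "short"/"long" labels, and the toy descriptions are all assumptions made for the example, and none of the data comes from the paper.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy, made-up protest descriptions and duration labels.
# These are illustrative only and are NOT taken from the paper's dataset.
descriptions = [
    "residents blocked the road over water supply and burned tyres",
    "students marched to the university demanding lower fees",
    "workers went on strike over wages and picketed the factory gate",
    "community protested against electricity cuts and blocked the highway",
    "students occupied the campus over tuition fees for several days",
    "miners continued the wage strike and refused to return to work",
]
durations = ["short", "short", "long", "short", "long", "long"]

# Assumed hyperparameters: 2 topics and a shallow tree, chosen for the toy data.
N_TOPICS = 2

pipeline = make_pipeline(
    # Step 1: bag-of-words counts from the free-text description.
    CountVectorizer(stop_words="english"),
    # Step 2 (unsupervised): LDA maps each description to a topic mixture.
    LatentDirichletAllocation(n_components=N_TOPICS, random_state=0),
    # Step 3 (supervised): a decision tree predicts duration from the mixture.
    DecisionTreeClassifier(max_depth=2, random_state=0),
)
pipeline.fit(descriptions, durations)

# Per-document topic proportions that the tree sees as input features.
topic_features = pipeline[:-1].transform(descriptions)

# Predict the duration class of a new, unseen description.
predicted = pipeline.predict(["teachers began a strike over unpaid wages"])
```

With so few documents the fitted model is not meaningful; the sketch only shows how the unsupervised and supervised stages connect. In practice the number of topics and the tree depth would be chosen by validation on held-out protests.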

Similar resources

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

Probabilistic Distributional Semantics with Latent Variable Models

We describe a probabilistic framework for acquiring selectional preferences of linguistic predicates and for using the acquired representations to model the effects of context on word meaning. Our framework uses Bayesian latent-variable models inspired by, and extending, the well-known Latent Dirichlet Allocation (LDA) model of topical structure in documents; when applied to predicate–argument ...

Comment Data Mining to Estimate Student Performance Considering Consecutive Lessons

The purpose of this study is to examine different formats of comment data to predict student performance. Having students write comment data after every lesson can reflect students’ learning attitudes, tendencies and learning activities involved with the lesson. In this research, Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (pLSA) are employed to predict student ...

Predicting student outcomes from unstructured data

We investigated the validity of applying topic modeling to unstructured student text data from online class discussion forums to predict students’ final grades. Using only student discussion data from introductory courses in biology and economics, both probabilistic latent semantic analysis (pLSA) and hierarchical latent Dirichlet allocation (hLDA) produced significantly better than chance pred...

Language model adaptation using latent dirichlet allocation and an efficient topic inference algorithm

We present an effort to perform topic mixture-based language model adaptation using latent Dirichlet allocation (LDA). We use probabilistic latent semantic analysis (PLSA) to automatically cluster a heterogeneous training corpus, and train an LDA model using the resultant topic-document assignments. Using this LDA model, we then construct topic-specific corpora at the utterance level for interpol...

Journal:
  • CoRR

Volume: abs/1711.00462

Publication date: 2017