Using Group Membership Markers for Group Identification
Abstract
We describe a system for automatically ranking documents by degree of militancy, designed as a tool both for finding militant websites and for prioritizing the data found. We compare three ranking systems: one employing a small hand-selected vocabulary based on group membership markers that insiders use to identify members and member properties (us) and outsiders and threats (them), one with a much larger vocabulary, and one with a small vocabulary chosen by Mutual Information. We use the same vocabularies to build classifiers. The ranker that achieves the best correlations with human judgments uses the small us-them vocabulary. We confirm and extend recent results in sentiment analysis (Paltoglou and Thelwall 2010), showing that a feature-weighting scheme taken from classical IR (TFIDF) produces the best ranking system. We also find, surprisingly, that adjusting these weights with SVM training produces a better classifier but a worse ranker. Increasing vocabulary size has the same effect: it improves classification while worsening ranking. Our work complements previous work tracking radical groups on the web (Chen 2007), which classified such sites with heterogeneous indicators. The method combines elements of machine learning and behavioral science, and should extend to any group organized for col-
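The ranking idea in the abstract, scoring each document by TFIDF weights over a small us-them vocabulary, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the `MARKERS` word list is hypothetical (the abstract does not give the actual vocabulary), the tokenizer is deliberately naive, and the smoothed TF-IDF formula is one common variant, not necessarily the weighting the authors used.

```python
import math
from collections import Counter

# Hypothetical us-them marker vocabulary; the real word list is not given in the abstract.
MARKERS = {"we", "our", "brothers", "they", "enemy", "traitors"}


def tokenize(text):
    """Naive whitespace tokenizer that lowercases and strips surrounding punctuation."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]


def tfidf_scores(docs, vocab=MARKERS):
    """Score each document by the sum of TF-IDF weights of its vocabulary terms."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each vocabulary term occurs.
    df = Counter()
    for tokens in token_lists:
        for term in vocab & set(tokens):
            df[term] += 1

    scores = []
    for tokens in token_lists:
        counts = Counter(t for t in tokens if t in vocab)
        length = len(tokens) or 1  # guard against empty documents
        # Length-normalized term frequency times a smoothed inverse document frequency
        # (an assumed variant; the paper's exact scheme is not specified here).
        score = sum(
            (count / length) * math.log((1 + n_docs) / (1 + df[term]))
            for term, count in counts.items()
        )
        scores.append(score)
    return scores


def rank(docs, vocab=MARKERS):
    """Return document indices ordered from highest to lowest marker score."""
    scores = tfidf_scores(docs, vocab)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Calling `rank(docs)` on a list of strings returns the indices of the documents, with those densest in rare marker terms first. A document containing no marker terms scores zero and sorts last.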