Linguistic Models for Analyzing and Detecting Biased Language
نویسندگان
چکیده
Unbiased language is a requirement for reference sources like encyclopedias and scientific texts. Bias is, nonetheless, ubiquitous, making it crucial to understand its nature and linguistic realization and hence detect bias automatically. To this end we analyze real instances of human edits designed to remove bias from Wikipedia articles. The analysis uncovers two classes of bias: framing bias, such as praising or perspective-specific words, which we link to the literature on subjectivity; and epistemological bias, related to whether propositions that are presupposed or entailed in the text are uncontroversially accepted as true. We identify common linguistic cues for these classes, including factive verbs, implicatives, hedges, and subjective intensifiers. These insights help us develop features for a model to solve a new prediction task of practical importance: given a biased sentence, identify the bias-inducing word. Our linguistically-informed model performs almost as well as humans tested on the same task.
منابع مشابه
Design and Implementation of a Software System for Detecting Orthographical or Morphological Errors in Persian Words
This paper presents a new method for analyzing words in the Persian language context to find orthographical and structural errors regardless of the meaning. This technique tokenizes each word in a statement then tries to detect the kind of word, and analyses its correctness in terms of orthography and morphology by means of a lexicon. It should be noted that some words in the Persian language h...
متن کاملThe Representation of Iran’s Nuclear Program in British Newspaper Editorials: A Critical Discourse Analytic Perspective
In this study, Van Dijk’s (1998) model of CDA was utilized in order to examine the representation of Iran’s nuclear program in editorials published by British news casting companies. The analysis of the editorials was carried out at two levels of headlines and full text stories with regard to the linguistic features of lexical choices, nominalization, passivization, overcompleteness, and voice....
متن کاملEditorial Volume 5, Issue 2
Our Journal's tendency towards the real world in applied linguistics and literary studies should have significant epistemological and methodological consequences in researching the fields. The interest in the real world makes the problems we may have in our everyday lives our 'points of departure' in research. According to my experience of research in our universities throughout their history, ...
متن کاملLinguistic-based Patterns for Figurative Language Processing: The Case of Humor Recognition and Irony Detection
Figurative language represents one of the most difficult tasks regarding natural language processing. Unlike literal language, figurative language takes advantage of linguistic devices such as irony, humor, sarcasm, metaphor, analogy, and so on, in order to communicate indirect meanings which, usually, are not interpretable by simply decoding syntactic or semantic information. Rather, figurativ...
متن کاملCausal Correlations between Genes and Linguistic Features – the Mechanism of Gradual Language Evolution
The causal correlations between human genetic variants and linguistic (typological) features could represent the mechanism required for gradual, accretionary models of language evolution. The causal link is mediated by the process of cultural transmission of language across generations in a population of genetically biased individuals. The particular case of Tone, ASPM and Microcephalin is disc...
متن کامل