Patterns of local discourse coherence as a feature for authorship attribution
نویسندگان
چکیده
We define a model of discourse coherence based on Barzilay and Lapata’s entity grids as a stylometric feature for authorship attribution. Unlike standard lexical and character-level features, it operates at a discourse (cross-sentence) level. We test it against and in combination with standard features on nineteen booklength texts by nine nineteenth-century authors. We find that coherence alone performs often as well as and sometimes better than standard features, though a combination of the two has the highest performance overall. We observe that despite the difference in levels, there is a correlation in performance of the two kinds of features. .................................................................................................................................................................................
منابع مشابه
The Relationship between Local and Global Coherence and Cognitive Processes in Persian-speaking Elderly Population
Objective: Many studies have suggested that there is a relationship between coherence and cognitive processes. This study aims at investigating this hypothesis through assessing the relationship between cognitive variables and coherence in the discourse of two groups of Persian-speaking younger and older adults. Methods: In order to evaluate our participants' cognitive capabilities, we recrui...
متن کاملEnhancing Authorship Attribution By Utilizing Syntax Tree Profiles
The aim of modern authorship attribution approaches is to analyze known authors and to assign authorships to previously unseen and unlabeled text documents based on various features. In this paper we present a novel feature to enhance current attribution methods by analyzing the grammar of authors. To extract the feature, a syntax tree of each sentence of a document is calculated, which is then...
متن کاملAutomatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs
We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of docume...
متن کاملLocal gradient pattern - A novel feature representation for facial expression recognition
Many researchers adopt Local Binary Pattern for pattern analysis. However, the long histogram created by Local Binary Pattern is not suitable for large-scale facial database. This paper presents a simple facial pattern descriptor for facial expression recognition. Local pattern is computed based on local gradient flow from one side to another side through the center pixel in a 3x3 pixels region...
متن کاملAuthorship Attribution of Micro-Messages
Work on authorship attribution has traditionally focused on long texts. In this work, we tackle the question of whether the author of a very short text can be successfully identified. We use Twitter as an experimental testbed. We introduce the concept of an author’s unique “signature”, and show that such signatures are typical of many authors when writing very short texts. We also present a new...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- LLC
دوره 29 شماره
صفحات -
تاریخ انتشار 2014