Style based Authorship Attribution on English Editorial Documents
Abstract
The aim of authorship attribution is to identify the author(s) of an unknown document. Every author has a unique writing style. The present paper identifies the unique style of each author using lexical stylometric features. The lexical feature vectors of the various authors are used to train supervised machine learning algorithms, which then predict the author of an unknown document. The highest average accuracy achieved is 97.22, using the SVM algorithm.
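As a rough illustration of what a lexical feature vector might look like, the sketch below computes a few common stylometric measures for one document. The specific features (average word length, type-token ratio, average sentence length, function-word rates), the `FUNCTION_WORDS` list, and the name `lexical_features` are assumptions chosen for illustration; the abstract does not state which lexical features the paper actually uses.

```python
import re
from collections import Counter

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "that", "is")


def lexical_features(text):
    """Return a fixed-length list of lexical style features for one document."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return [0.0] * (3 + len(FUNCTION_WORDS))

    counts = Counter(words)
    n = len(words)
    avg_word_len = sum(len(w) for w in words) / n
    type_token_ratio = len(counts) / n  # vocabulary richness
    avg_sentence_len = n / max(len(sentences), 1)  # words per sentence
    fw_rates = [counts[w] / n for w in FUNCTION_WORDS]  # relative frequencies
    return [avg_word_len, type_token_ratio, avg_sentence_len] + fw_rates


doc = "The cat sat. The dog ran to the cat."
vec = lexical_features(doc)
print(vec)
```

Vectors like this, one per document and labeled with the known author, could then be passed to any supervised classifier, for example an SVM such as scikit-learn's `sklearn.svm.SVC`.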
Related resources
Questioned Electronic Documents: Empirical Studies in Authorship Attribution
Forensic analysis of questioned electronic documents is very difficult, because the nature of the documents eliminates many kinds of informative differences. Recent work in authorship attribution demonstrates the practicality of analyzing documents based on authorial style, but the state of the art is confusing. Analyses are difficult to apply, little is known about type or rate of errors, and ...
Shallow Text Analysis and Machine Learning for Authorship Attribution
Current advances in shallow parsing and machine learning allow us to use results from these fields in a methodology for Authorship Attribution. We report on experiments with a corpus that consists of newspaper articles about national current affairs by different journalists from the Belgian newspaper De Standaard. Because the documents are in a similar genre, register, and range of topics, toke...
Style-Markers in Authorship Attribution: A Cross-Language Study of the Authorial Fingerprint
The present study addresses one of the theoretical problems of computer-assisted authorship attribution, namely the question which traceable features of language can betray authorial uniqueness (a stylistic fingerprint) of literary texts. A number of recent approaches show that apart from lexical measures (especially those relying on the frequencies of the most frequent words) also some oth...
Author Profiling using LDA and Maximum Entropy: Notebook for PAN at CLEF 2013
This paper describes the traditional authorship attribution subtask of the PAN/CLEF 2013 workshop. In our attempt to classify the documents based on the gender and age of an author, we have applied a traditional approach of topic modeling using Latent Dirichlet Allocation (LDA). We used content-based features like topics and style-based features like preposition frequencies, which act as the eff...