Pastiche Detection Based on Stopword Rankings. Exposing Impersonators of a Romanian Writer
Authors
Abstract
We applied hierarchical clustering with Rank distance, a metric previously used in computational stylometry, to literary texts written by Mateiu Caragiale and by several other authors who attempted to impersonate Caragiale after his death, or simply to mimic his style. Their pastiches were consistently clustered apart from the original work, which confirms the performance of the method and suggests extending it from simple authorship attribution to the more complicated problem of pastiche detection. The novelty of our work is the use of frequency rankings of stopwords as features; we show that this idea yields good results for pastiche detection.
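The approach described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stopword list is a hypothetical six-word English placeholder, the distance is taken to be the sum of absolute differences between tie-averaged ranks (an assumption about the exact Rank distance variant), and average linkage is an assumed choice, since the abstract does not name the linkage criterion.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list used only for illustration.
STOPWORDS = ["the", "of", "and", "to", "in", "a"]


def stopword_ranks(text, stopwords=STOPWORDS):
    """Rank stopwords by descending frequency; tied words share the average rank."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    ordered = sorted(stopwords, key=lambda w: -counts[w])
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of words with the same count as ordered[i].
        while j + 1 < len(ordered) and counts[ordered[j + 1]] == counts[ordered[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2  # mean of the 1-based positions i+1 .. j+1
        for word in ordered[i:j + 1]:
            ranks[word] = avg
        i = j + 1
    return ranks


def rank_distance(r1, r2):
    """Sum of absolute rank differences over the shared stopword list (assumed variant)."""
    return sum(abs(r1[w] - r2[w]) for w in r1)


def cluster(texts):
    """Average-linkage agglomerative clustering; returns the list of merges in order."""
    ranks = {name: stopword_ranks(body) for name, body in texts.items()}
    clusters = [(name,) for name in texts]
    merges = []

    def linkage(a, b):
        total = sum(rank_distance(ranks[x], ranks[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [
            (linkage(a, b), a, b)
            for i, a in enumerate(clusters)
            for b in clusters[i + 1:]
        ]
        dist, a, b = min(pairs, key=lambda p: p[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, dist))
    return merges
```

Calling `cluster({"original": ..., "pastiche": ...})` with a mapping from hypothetical text labels to text bodies returns the merges in the order they occur. Texts whose stopword rankings are similar merge early, while a pastiche with a different ranking profile would be expected to join the original author's cluster only at a late, high-distance merge.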
Similar resources
Parameters for Topic Boundary Detection in Multi-Party Dialogues
We present a topic boundary detection method that searches for connections between sequences of utterances in multi-party dialogues. The connections are established based on word identity. We compare our method to a state-of-the-art automatic topic boundary detection method that was also used on multi-party dialogues. We checked various methods of preprocessing of the data, including stemming, ...
The Tash her Father Wore : World Literature ,
This article studies the Turkish-German writer Kemal Kurt’s Ja, sagt Molly (1998) [‘Yes, says Molly’], an ironic meta-fiction to which little critical attention has been paid. Kurt questions the representation of Turks as untutored aspirants to Western culture and challenges the traditional images of exclusion and discrimination. Through a study of his use of pastiche and references to World Li...
Authorship Identification of Romanian Texts with Controversial Paternity
In this work we propose a new strategy for the authorship identification problem and we test it on an example from Romanian literature: did Radu Albala find the continuation of Mateiu Caragiale’s novel “Sub pecetea tainei”, or did he write the continuation himself? The proposed strategy is based on the similarity of rankings of function words; we compare the obtained results with th...
Chaperones and Impersonators: Run-time Support for Contracts on Higher-Order, Stateful Values
Racket’s chaperone and impersonator language constructs provide run-time support for implementing higher-order contracts on mutable structures and abstract datatypes. Using chaperones and impersonators, contracts on mutable data can be enforced without changing the API to that data; contracts on large data structures can be checked lazily on only the accessed parts of the structure; contracts o...
Automatically Building a Stopword List for an Information Retrieval System
Words in a document that are frequently occurring but meaningless in terms of Information Retrieval (IR) are called stopwords. It is repeatedly claimed that stopwords do not contribute towards the context or information of the documents and they should be removed during indexing as well as before querying by an IR system. However, the use of a single fixed stopword list across different documen...