Intrinsic Detection of Plagiarism based on Writing Style Grouping

Authors

  • Maryam Elamine
  • Seifeddine Mechti
  • Lamia Hadrich Belguith
Abstract

This paper tackles intrinsic plagiarism detection, also referred to as author diarization: the task of identifying segments within a document written by multiple authors [2]. The main goal is to discover deviations in writing style, looking for parts of the document that may have been written by another person [4]. We present a hybrid approach that constructs a style function from stylometric features and detects outliers. The approach was evaluated on two publicly available corpora, and the results outperform those of the best state-of-the-art methods.
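The abstract does not specify the features or the outlier rule, so the following is only a minimal sketch of the general idea of a style function with outlier detection, not the authors' method. It assumes three simple per-sentence features (average word length, word count, punctuation per word), a style score defined as the mean absolute z-score of those features, and an arbitrary threshold of 1.5. All function names and parameters are hypothetical.

```python
import re
from statistics import mean, pstdev


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def features(sentence):
    # Illustrative stylometric features, chosen for this sketch only.
    words = re.findall(r"[A-Za-z']+", sentence)
    n = max(len(words), 1)
    avg_word_len = sum(len(w) for w in words) / n
    punct_per_word = len(re.findall(r"[,;:]", sentence)) / n
    return [avg_word_len, float(len(words)), punct_per_word]


def style_scores(sentences):
    # Style function: mean absolute z-score of each sentence's features,
    # measured against the whole document.
    rows = [features(s) for s in sentences]
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant feature has zero deviation; fall back to 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [
        mean(abs(x - m) / sd for x, m, sd in zip(row, mus, sigmas))
        for row in rows
    ]


def find_outliers(text, threshold=1.5):
    # Return (index, sentence) pairs whose style score exceeds the threshold.
    sentences = split_sentences(text)
    scores = style_scores(sentences)
    return [
        (i, s)
        for i, (s, score) in enumerate(zip(sentences, scores))
        if score > threshold
    ]
```

Calling `find_outliers(document_text)` returns the sentences whose style deviates most from the rest of the document. A practical system would likely use richer features and score sliding windows of text rather than single sentences.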


Similar resources

Text mining applied to plagiarism detection: The use of words for detecting deviations in the writing style

Plagiarism detection is of special interest to educational institutions, and with the proliferation of digital documents on the Web the use of computational systems for such a task has become important. While traditional methods for automatic detection of plagiarism compute the similarity measu...


Authorship and Plagiarism Detection Using Binary BOW Features

Identifying shifts and variations in writing style is a fundamental capability when addressing authorship-related tasks. In this work we examine a simplified approach for unsupervised authorship and plagiarism detection which is based on a binary bag-of-words representation. We evaluate our approach using PAN-2012 Authorship Attribution challenge data, which includes both open/closed class authorsh...


Intrinsic Plagiarism Analysis with Meta Learning

In intrinsic plagiarism analysis we are given a document, allegedly written by a single author, and the task is to find sufficient evidence either to accept or to reject this hypothesis. Existing research on intrinsic plagiarism analysis tries to quantify changes in the writing style by analyzing the distributions of particular style markers. This way, acceptable detection rates can be achieved...


Methods for Intrinsic Plagiarism Detection and Author Diarization

The paper investigates methods for intrinsic plagiarism detection and author diarization. We developed a plagiarism detection method based on constructing an author style function from features of text sentences and detecting outliers. We adapted the method for the diarization problem by segmenting author style statistics on text parts, which correspond to different authors. Both methods were t...


Intrinsic Plagiarism Detection

Current research in the field of automatic plagiarism detection for text documents focuses on algorithms that compare plagiarized documents against potential original documents. Though these approaches perform well in identifying copied or even modified passages, they assume a closed world: a reference collection must be given against which a plagiarized document can be compared. This raises th...



Publication date: 2017