Analyzing Similarity in Mathematical Content To Enhance the Detection of Academic Plagiarism

Author

  • Maurice-Roman Isele
Abstract

Despite the effort put into the detection of academic plagiarism, it continues to be a ubiquitous problem spanning all disciplines. Various tools have been developed to assist human inspectors by automatically identifying suspicious documents. However, to my knowledge, none of these tools currently uses mathematical content for its analysis. This is problematic, because mathematical content potentially represents a significant part of the scientific contribution in academic documents. Hence, ignoring mathematical content limits the detection of plagiarism considerably, especially in disciplines with frequent use of mathematics. This paper aims to help close this gap by providing an overview of existing approaches in mathematical information retrieval and an analysis of their applicability to different possible cases of mathematical plagiarism. I find that whereas syntax-based approaches perform particularly well in detecting undisguised plagiarism, structure-based and hybrid approaches promise to also detect forms of disguised mathematical plagiarism, such as plagiarism with renamed identifiers. However, more research in this area is needed to enable the detection of more complex mathematical plagiarism: the scope of current approaches is restricted to the formula level, and an extension to the section level is needed. Additionally, the general detection of equivalence transformations is currently not feasible. Despite these remaining problems, I conclude that the presented approaches could already be used for a basic automated detection system targeting mathematical plagiarism and could therefore enhance current plagiarism detection systems.
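
To make the case of renamed identifiers concrete, the following is a minimal illustrative sketch in Python and not one of the approaches surveyed in the paper. It assumes formulas are given as plain strings; the function names `tokenize`, `canonicalize`, and `same_up_to_renaming` are hypothetical. Each distinct identifier is replaced by a placeholder in order of first appearance, so two formulas that differ only in identifier names compare as equal.

```python
import re

# Hypothetical helper: split a plain-text formula into identifiers,
# numbers, and single-character symbols (operators, brackets).
def tokenize(formula: str) -> list[str]:
    return re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S", formula)

# Replace each distinct identifier with a placeholder (v0, v1, ...)
# in order of first appearance. Assumption: every alphabetic token is
# a renamable identifier, so function names such as "sin" are also
# treated as identifiers in this simplified sketch.
def canonicalize(formula: str) -> list[str]:
    mapping: dict[str, str] = {}
    tokens = []
    for tok in tokenize(formula):
        if re.fullmatch(r"[A-Za-z_]\w*", tok):
            mapping.setdefault(tok, f"v{len(mapping)}")
            tokens.append(mapping[tok])
        else:
            tokens.append(tok)
    return tokens

# Two formulas match if they are identical after canonical renaming.
def same_up_to_renaming(a: str, b: str) -> bool:
    return canonicalize(a) == canonicalize(b)

if __name__ == "__main__":
    print(same_up_to_renaming("E = m * c^2", "F = n * d^2"))
    print(same_up_to_renaming("x^2 + 2*x", "2*y + y^2"))
```

The second call compares two formulas that are equal only after reordering terms. This sketch does not recognise them as matching, which is in line with the abstract's remark that general equivalence transformations remain out of reach.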

Similar resources

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

Analyzing Semantic Concept Patterns to Detect Academic Plagiarism

Detecting academic plagiarism is a pressing problem, e.g., for educational and research institutions, funding agencies, and academic publishers. Existing plagiarism detection systems reliably identify (nearly) copied text, but often fail to detect disguised forms of academic plagiarism, such as paraphrases, translations, and idea plagiarism. We present Semantic Concept Pattern Analysis an approac...

Plagiarism checker for Persian (PCP) texts using hash-based tree representative fingerprinting

With due respect to the authors’ rights, plagiarism detection is one of the critical problems in the field of text mining that many researchers are interested in. This issue is considered a serious one in higher academic institutions. There exist language-free tools which do not yield any reliable results since the special features of every language are ignored in them. Considering the paucit...

A New Similarity Measure with Length Factor for Plagiarism Detection

Different similarity measures are available for comparison of textual data. These similarity measures are used for plagiarism detection. This research paper proposes a new similarity measure. Moreover, this paper proposes to consider length of content for plagiarism score determination. General Terms Data mining, plagiarism detection.

An introduction to the examples of scientific plagiarism and its identification soft-wares

Background: Increasing Immorality and Plagiarism in the country's higher education system has become a serious crisis. Hence, the purpose of this study was to analyze the Examples of Plagiarism and the introduction of Plagiarism detection software. Method: The present study is a narrative review study. Articles in Persian and Latin related to the use of scientific theft key words in databases w...

Journal title:
  • CoRR

Volume abs/1801.08439

Publication date 2018