Visualizing and Understanding Code Duplication in Large Software Systems

Authors

Zhen Ming Jiang

Richard C. Holt

Ahmed E. Hassan

Abstract

Code duplication, or code cloning, is a common phenomenon in the development of large software systems. Developers have a love-hate relationship with cloning. On one hand, cloning speeds up the development process. On the other hand, clone management becomes a challenging task as software evolves. Cloning has commonly been considered undesirable for software maintenance, and several research efforts have been devoted to automatically detecting clones and eliminating them aggressively. However, little empirical work has been done to analyze the consequences of cloning with respect to software quality. Recent studies show that cloning is not necessarily undesirable: cloning can be used to minimize risks, and there are cases where cloning is used as a design technique. In this thesis, three visualization techniques are proposed to aid researchers in analyzing cloning in large software systems. All of the visualizations abstract and display cloning information at the subsystem level, but with different emphases. At the subsystem level, clones can be classified as internal clones and external clones. Internal clones are code duplicates that reside in the same subsystem, whereas external clones are spread across different subsystems. Software architecture quality attributes such as cohesion and coupling are introduced to contribute to the study of cloning at the architecture level. The Clone Cohesion and Coupling (CCC) Graph and the Clone System Hierarchy (CSH) Graph display the cloning information for a single release. In particular, the CCC Graph highlights the amount of internal and external cloning for each subsystem, whereas the CSH Graph focuses more on the details of the spread of cloning. Finally, the Clone System Evolution (CSE) Graph shows the evolution of cloning over a period of time.
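The subsystem-level classification described in the abstract can be illustrated with a minimal sketch. This is not the thesis's tooling: the function names, the file paths, and the rule that a file's top-level directory is its subsystem are all assumptions made for illustration. A clone pair counts as internal when both fragments fall in the same subsystem, and as external otherwise.

```python
from collections import Counter
from typing import Iterable, Tuple


def subsystem_of(path: str) -> str:
    """Map a file path to a subsystem name.

    Assumption: the top-level directory is the subsystem. A real study
    would use the system's actual architectural decomposition.
    """
    return path.split("/", 1)[0]


def classify_clone_pairs(
    pairs: Iterable[Tuple[str, str]],
) -> Tuple[Counter, Counter]:
    """Count internal and external clone pairs per subsystem."""
    internal: Counter = Counter()
    external: Counter = Counter()
    for file_a, file_b in pairs:
        sub_a, sub_b = subsystem_of(file_a), subsystem_of(file_b)
        if sub_a == sub_b:
            # Both fragments live in the same subsystem: internal clone.
            internal[sub_a] += 1
        else:
            # The pair spans two subsystems: external clone for both.
            external[sub_a] += 1
            external[sub_b] += 1
    return internal, external


# Hypothetical clone pairs, as a clone detector might report them.
pairs = [
    ("fs/alpha/inode.c", "fs/beta/inode.c"),
    ("fs/alpha/dir.c", "drivers/block/loop.c"),
    ("drivers/net/a.c", "drivers/scsi/b.c"),
]
internal, external = classify_clone_pairs(pairs)
```

Per-subsystem counts of this kind are the sort of input a cohesion-and-coupling view could summarize: many internal pairs suggest cloning contained within a subsystem, while many external pairs suggest cloning that ties subsystems together.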

Similar Resources

Visualizing Similarities in Execution Traces

The analysis of execution traces is a common practice in the context of software understanding. A major issue during this task is scalability, as the massive amounts of data often make the comprehension process difficult. A significant portion of this data overload can be attributed to repetitions that are caused by, for example, iterations in the software’s source code. In this position paper,...

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

On Finding Duplication and Near-Duplication in Large Software Systems

This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except...

Visualizing Object-oriented Software for Understanding and Documentation

Understanding or comprehending source code is one of the core activities of software engineering. Understanding object-oriented source code is essential when a programmer maintains, migrates, reuses, documents or enhances source code. Source code that is not comprehended cannot be changed. The comprehension of object-oriented source code is a difficult problem-solving process. I...

Reverse Engineering by Visualizing and Querying

The automatic extraction of high-level structural information from code is important for both software maintenance and reuse. Instead of using special-purpose tools, we explore the use of a general-purpose data visualization system called Hy+ for querying and visualizing information about object-oriented software systems. Hy+ supports visualization and visual querying of arbitrary graph-like dat...

Publication date: 2008
