Towards Maintenance Support for Idiom-based Code Using Sequential Pattern Mining

Authors

  • Tatsuya Miyake
  • Takashi Ishio
  • Koji Taniguchi
  • Katsuro Inoue
Abstract

Developers often use an idiom to implement a concern. When a fault is found in an idiom, developers have to find all source code fragments derived from the original. While code-clone detection tools can detect copy-and-pasted code, such tools cannot detect code fragments that were modified after being pasted. We are investigating a sequential pattern mining approach to capture idiom-based code that spreads across modules. This position paper presents our approach and the result of a preliminary case study.

1. Motivation: Idiom-based Coding

To develop large-scale software, developers use idioms to implement particular kinds of concerns that are not modularized in the software [9]. They obtain idioms from the source code of their software, the coding standard of their team and other available resources. This idiom-based coding leads to many instances of an idiom crosscutting modules; a fault in an idiom is also copied and spreads across modules.

When developers find a fault in an idiom, they have to inspect all instances of the idiom to fix the problem [2, 3, 5]. Inspecting all instances of an idiom is time-consuming, since developers derive different code for each programming context from the original idiom. Although code clone detection tools such as CCFinder can find all copy-and-pasted code and some variants that are very similar to the original code [7], they cannot cover derived code fragments that are no longer code clones. Developers may use keyword search tools instead, but then it is hard for them to be sure that their inspection is complete. Developers need a tool to find idiom-based code fragments that are similar to one another and contribute to one specific concern.

[Figure 1. Idiom-based code and code clone. Labels in the figure: code derived from the original idiom; an original idiom; code clone; sequential pattern; (developers have to inspect).]
This problem lies on the boundary between code clone analysis and aspect mining, since both techniques detect homogeneous code crosscutting modules to support software maintenance [4, 8]. We plan to investigate the combination of code clone analysis, sequential pattern mining and other techniques to find all instances of an idiom of interest to developers, as shown in Figure 1.

As a first step of the research, we are investigating whether a sequential pattern mining approach can detect frequent coding patterns based on idioms. We have applied PrefixSpan [10], a sequential pattern mining algorithm, to JHotDraw as a preliminary case study. The result shows that sequential pattern mining extracted several idiom-based patterns that implement crosscutting concerns such as the undo functionality in JHotDraw. We will combine our sequential pattern mining approach with code clone analysis to help developers inspect all instances of an idiom. Our approach is related to Fluid AOP [6]; we focus on finding and managing crosscutting code rather than directly modularizing crosscutting code as an aspect.

The rest of the paper is structured as follows. Section 2 describes our sequential pattern mining approach for a Java program. Section 3 shows the result of a case study on JHotDraw. Section 4 summarizes our current state and future directions.

    for (Iterator it = list.iterator(); it.hasNext(); ) {
        Item item = (Item) it.next();
        if (item.isActive()) {
            item.deactivate();
        }
    }

    Collection.iterator()
    Iterator.hasNext()
    LOOP
    Iterator.next()
    Item.isActive()
    IF
    Item.deactivate()
    END-IF
    END-LOOP

Figure 2. A sequence extracted from source code

2. Sequential Pattern Mining

Sequential pattern mining extracts frequent subsequences from a sequence database [1]. We applied PrefixSpan [10] to a sequence database extracted from Java software.
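To make "frequent subsequences" concrete, the following is a minimal sketch of prefix-projected pattern growth in the style of PrefixSpan, restricted to sequences whose elements are single strings. It is not the authors' tool: the class name `PrefixSpanSketch`, the method names and the example database are assumptions made for illustration, and only the first example sequence is taken from Figure 2. Support is counted as the number of sequences that contain a pattern as a (not necessarily contiguous) subsequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch, not the authors' implementation: PrefixSpan-style
// mining for sequences of single string elements (calls, IF, LOOP, ...).
final class PrefixSpanSketch {

    /** Maps each frequent subsequence to its support (number of sequences containing it). */
    static Map<List<String>, Integer> mine(List<List<String>> database, int minSupport) {
        Map<List<String>, Integer> frequent = new LinkedHashMap<>();
        // A projection entry is {sequence index, start offset of the remaining suffix}.
        List<int[]> projection = new ArrayList<>();
        for (int i = 0; i < database.size(); i++) {
            projection.add(new int[] {i, 0});
        }
        grow(database, new ArrayList<>(), projection, minSupport, frequent);
        return frequent;
    }

    private static void grow(List<List<String>> database, List<String> prefix,
                             List<int[]> projection, int minSupport,
                             Map<List<String>, Integer> frequent) {
        // For every element, collect the suffixes that follow its first
        // occurrence in each projected sequence.
        Map<String, List<int[]>> extensions = new TreeMap<>();
        for (int[] entry : projection) {
            List<String> sequence = database.get(entry[0]);
            Set<String> seen = new HashSet<>();
            for (int pos = entry[1]; pos < sequence.size(); pos++) {
                String element = sequence.get(pos);
                if (seen.add(element)) {
                    extensions.computeIfAbsent(element, k -> new ArrayList<>())
                              .add(new int[] {entry[0], pos + 1});
                }
            }
        }
        for (Map.Entry<String, List<int[]>> extension : extensions.entrySet()) {
            // Each supporting sequence contributes exactly one projection entry.
            int support = extension.getValue().size();
            if (support < minSupport) {
                continue;
            }
            List<String> pattern = new ArrayList<>(prefix);
            pattern.add(extension.getKey());
            frequent.put(pattern, support);
            grow(database, pattern, extension.getValue(), minSupport, frequent);
        }
    }

    /** Hand-written example data; only the first sequence comes from Figure 2. */
    static List<List<String>> exampleDatabase() {
        return Arrays.asList(
            Arrays.asList("Collection.iterator()", "Iterator.hasNext()", "LOOP",
                          "Iterator.next()", "Item.isActive()", "IF",
                          "Item.deactivate()", "END-IF", "END-LOOP"),
            Arrays.asList("Collection.iterator()", "Iterator.hasNext()", "LOOP",
                          "Iterator.next()", "END-LOOP"),
            Arrays.asList("Object.toString()"));
    }

    public static void main(String[] args) {
        mine(exampleDatabase(), 2).forEach(
            (pattern, support) -> System.out.println(support + "  " + pattern));
    }
}
```

With a minimum support of 2, the iteration sequence shared by the first two example methods is reported together with all of its shorter subsequences, which illustrates why extracted patterns need to be aggregated into groups afterwards.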
We translated the source code of a Java method into a sequence that comprises method call, IF and LOOP elements, since an idiom is a small code fragment including method calls and several control blocks. Figure 2 is an example of a sequence extracted from a source code fragment.

Method call element
  A method call is translated into a call element. To handle dynamic binding, an element has a reference to the class that declares the method in the class hierarchy. For example, method calls to String.equals(Object) and List.equals(Object) are not distinguished; a method call element Object.equals(Object) is generated for each call.

IF/ELSE/END-IF element
  An if statement is translated into a pair of IF and END-IF elements. If the predicate of the statement calls a method, a method call element corresponding to the method call is inserted before the IF element.

LOOP/END-LOOP element
  for and while statements are translated into a pair of LOOP and END-LOOP elements. A method call in the predicate of the loop is translated into a method call element inserted before the LOOP element. We focus on the syntactic structure of a loop instead of precise control-flow information. We ignore break, continue and return statements in a loop.

    Concern                  Support  Class
    Iteration (1)                 54     31
    Undo (1)                      14     14
    Undo (2)                      12     12
    Selection of figures          10     10
    Selection of figures           9      9
    Iteration (2)                  8      8
    Updating views                 6      6
    Handling mouse input           6      6
    Manipulating polygons          6      1
    Drawing image                  6      1

Table 1. Patterns extracted from JHotDraw

PrefixSpan extracts frequent subsequences from the database for a Java program. Several extracted patterns form a group that shares a common set of methods. For example, a pattern “Collection.iterator, LOOP, Iterator.hasNext, Iterator.next, END-LOOP” implies shorter patterns such as “LOOP, Iterator.hasNext, Iterator.next, END-LOOP” and “Iterator.hasNext, Iterator.next”. We aggregate these patterns into a single pattern group.

3. Sequential Patterns in JHotDraw

To evaluate whether sequential patterns can capture idiom-based code, we have conducted a preliminary case study. We applied the sequential pattern mining method described in the previous section to JHotDraw, which is a drawing application well studied in the AOP community. JHotDraw comprises 2,900 methods. Its total size is 18,000 lines of code.

We extracted method call patterns that involve at least 4 elements and have at least 4 instances in the program. As a result, we extracted 38 method call patterns. We have manually investigated which concerns the extracted patterns implement.

Table 1 shows the 10 most frequent patterns extracted from JHotDraw. Each row represents a pattern. The first column, Concern, is the name of the concern that the pattern implements. The second column, Support, indicates the number of instances of the pattern. The third column, Class, shows the number of classes that involve one or more instances of the pattern in their methods.

3.1. Patterns of a Crosscutting Concern

We found 22 patterns related to crosscutting concerns among the 38 extracted patterns. We recognized two features in these patterns.

org.jhotdraw.standard.DuplicateCommand

    public void execute() {
        super.execute();
        setUndoActivity(createUndoActivity());
        FigureSelection selection = view().getFigureSele・・・
        // create duplicate figure(s)
        FigureEnumeration figures = (FigureEnumeration) ・・
        getUndoActivity().setAffectedFigures(figures);
        view().clearSelection();
        ・・・
    }

org.jhotdraw.figures.BorderTool

    public void action(Figure figure) {
        // Figure replaceFigure = drawing().replace(figur ・・
        setUndoActivity(createUndoActivity());
        List l = CollectionsFactory.current().createList();
        l.add(figure);
        l.add(new BorderDecorator(figure));
        getUndoActivity().setAffectedFigures(new Fig ・・
        ((BorderTool.UndoActivity)getUndoActivity()).repl ・・
    }

org.jhotdraw.standard.ResizeHandle

    public void invokeStart(int x, int y, DrawingView view) {
        setUndoActivity(createUndoActivity(view));
        getUndoActivity().setAffectedFigures(new Sing ・・
        ((ResizeHandle.UndoActivity)getUndoActivity()).se・・・
    }

Pattern elements shown in the figure: setUndoActivity(), createUndoActivity(), getUndoActivity(), setAffectedFigures()

Other labels in the figure: AbstractCommand, AbstractTool
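Once a pattern such as the undo pattern in the excerpts above is known, its instances can be located by testing each method's element sequence for the pattern as a subsequence. The sketch below assumes that sequences are already available as lists of strings; the class `PatternInstanceFinder`, its methods and the example sequences are hypothetical, hand-simplified illustrations (unqualified names in the order the figure lists them, plus one made-up method without undo handling) and are not output of the authors' tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: locate the methods whose element sequences contain
// a given pattern as a (not necessarily contiguous) subsequence.
final class PatternInstanceFinder {

    /** The undo pattern, in the order its elements are listed in the figure. */
    static final List<String> UNDO_PATTERN = Arrays.asList(
        "setUndoActivity()", "createUndoActivity()",
        "getUndoActivity()", "setAffectedFigures()");

    /** True if every pattern element occurs in the sequence, in order. */
    static boolean containsPattern(List<String> sequence, List<String> pattern) {
        int matched = 0;
        for (String element : sequence) {
            if (matched < pattern.size() && element.equals(pattern.get(matched))) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    /** Names of the methods whose sequences contain the pattern, in map order. */
    static List<String> findInstances(Map<String, List<String>> sequencesByMethod,
                                      List<String> pattern) {
        List<String> instances = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : sequencesByMethod.entrySet()) {
            if (containsPattern(entry.getValue(), pattern)) {
                instances.add(entry.getKey());
            }
        }
        return instances;
    }

    /** Hand-simplified example sequences; "Example.withoutUndo" is made up. */
    static Map<String, List<String>> exampleSequences() {
        Map<String, List<String>> sequences = new LinkedHashMap<>();
        sequences.put("DuplicateCommand.execute", Arrays.asList(
            "setUndoActivity()", "createUndoActivity()", "view()",
            "getUndoActivity()", "setAffectedFigures()", "view()", "clearSelection()"));
        sequences.put("BorderTool.action", Arrays.asList(
            "setUndoActivity()", "createUndoActivity()", "createList()", "add()", "add()",
            "getUndoActivity()", "setAffectedFigures()", "getUndoActivity()"));
        sequences.put("Example.withoutUndo", Arrays.asList(
            "view()", "clearSelection()"));
        return sequences;
    }

    public static void main(String[] args) {
        System.out.println(findInstances(exampleSequences(), UNDO_PATTERN));
    }
}
```

Because the test is for a subsequence rather than a contiguous run, unrelated calls between the pattern elements (such as the list construction in BorderTool.action) do not prevent a match, which is the property that distinguishes this kind of search from clone detection on modified code.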




Publication date: 2007