An improved methodology on information distillation by mining program source code
Abstract
This paper presents a methodology for knowledge acquisition from source code. We use data mining to support semi-automated software maintenance and comprehension and to provide practical insights into system specifics, assuming one has limited prior familiarity with these systems. We propose a methodology and an associated model for extracting information from object-oriented code by applying clustering and association rule mining. K-means clustering produces system overviews and deductions, which support the subsequent use of an improved version of MMS Apriori that identifies hidden relationships between classes, methods and member data. The methodology is evaluated on an industrial case study, results are discussed and conclusions are drawn. © 2006 Elsevier B.V. All rights reserved.
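As a rough illustration of the two mining steps named in the abstract, the sketch below clusters classes by simple metrics with plain k-means and then counts frequently co-occurring member data across methods. It is not the paper's model or its improved MMS Apriori: the class names, metrics, transactions, and the helper names `kmeans` and `frequent_pairs` are hypothetical, and the second step uses ordinary single-threshold support counting on item pairs as a stand-in.

```python
from itertools import combinations


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means on tuples of numbers; initial centroids are given."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return clusters, centroids


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (fraction of transactions) meets min_support."""
    counts = {}
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] = counts.get(pair, 0) + 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical input: (number of methods, number of fields) per class.
metrics = {"Order": (12, 5), "Invoice": (11, 6), "Util": (2, 0), "Logger": (3, 1)}
points = list(metrics.values())
clusters, centroids = kmeans(points, centroids=[(12, 5), (2, 0)])

# Hypothetical input: the member data each method touches.
transactions = [
    {"total", "tax"},
    {"total", "tax", "customer"},
    {"customer"},
    {"total", "tax"},
]
pairs = frequent_pairs(transactions, min_support=0.5)
```

With this toy input the clustering separates the two larger classes from the two small utility classes, and the only pair that clears the 0.5 support threshold is `tax` with `total`. Such a pair is the kind of relationship between member data that association rule mining is meant to surface.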
Similar resources
Toward Obtaining Event Logs from Legacy Code
Information systems are ageing over time and become legacy information systems which often embed business knowledge that is not present in any other artifact. This embedded knowledge must be preserved to align the modernized versions of the legacy systems with the current business processes of an organization. Process mining is a powerful tool to discover and preserve business knowledge. Most p...
Comparison and evaluation of source code
Program source code is substantially structured and contains semantically rich programming constructs such as variables, functions, data structures, and program structures which indicate patterns. Mining source code by using different data mining techniques to extract the valuable hidden patterns is the new revolution in software engineering. Over the last decade many tools and techniques hav...
Chemical Process Modeling in Modelica
Chemical process models are highly structured. Information on how the hierarchical components are connected helps to solve the model efficiently. Our ultimate goal is to develop structure-driven optimization methods for solving nonlinear programming problems (NLP). The structural information retrieved from the JModelica environment will play an important role in the development of our novel opt...
Mining Source Code for Design Regularities
The aim of this working session on Industrial Realities of Program Comprehension is to exchange and discuss experiences, opportunities, challenges and strategies for the application of program comprehension techniques in industry. In this position paper we focus on a potentially interesting opportunity and challenge for adopting program comprehension techniques, and source code mining technique...
Mining of Source Code Concepts and Idioms An Approach based on Clone Detection Techniques
This paper introduces a new view on program source code with a focus on code clone information. An algorithm is presented that transforms source code into an equivalent representation which expresses code redundancies as hierarchical clone classes explicitly. This representation supports program comprehension by pointing out arbitrary programming idioms and the frequencies of their occurrences ...
Journal:
- Data Knowl. Eng.
Volume 61
Publication date: 2007