Predicting Source Code Quality with Static Analysis and Machine Learning
Authors
Abstract
This paper investigates whether source code quality can be predicted using static analysis and machine learning. The proposed approach, which includes an Eclipse plugin, combines peer review (human rating), static code analysis, and classification methods. Public data and student programming hand-ins serve as training data. Based on this training data, new and uninspected source code can be accurately classified as “well written” or “badly written”. This is a step towards feedback in an interactive environment without peer assessment.
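The pipeline described in the abstract (static metrics extracted from code, human ratings as labels, a classifier for unseen code) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three metrics, the function names `extract_metrics`, `train_centroids`, and `classify`, and the nearest-centroid classifier are hypothetical stand-ins, not the metrics or classification methods used in the paper.

```python
from statistics import mean


def extract_metrics(source: str) -> list[float]:
    """Compute three illustrative static metrics from raw source text.

    Hypothetical metrics: average non-blank line length, share of
    comment lines, and maximum brace nesting depth.
    """
    lines = [ln for ln in source.splitlines() if ln.strip()]
    if not lines:
        return [0.0, 0.0, 0.0]
    avg_len = mean(len(ln) for ln in lines)
    comment_ratio = sum(ln.strip().startswith(("//", "#")) for ln in lines) / len(lines)
    depth = max_depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth = max(0, depth - 1)
    return [float(avg_len), comment_ratio, float(max_depth)]


def train_centroids(samples: list[tuple[str, str]]) -> dict[str, list[float]]:
    """Average the metric vectors of human-rated samples, per label."""
    by_label: dict[str, list[list[float]]] = {}
    for source, label in samples:
        by_label.setdefault(label, []).append(extract_metrics(source))
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def classify(source: str, centroids: dict[str, list[float]]) -> str:
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    vec = extract_metrics(source)

    def distance(centroid: list[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: distance(centroids[label]))


# Toy training data standing in for human-rated hand-ins (invented for this sketch).
GOOD = "// add two numbers\nint add(int a, int b) {\n    return a + b;\n}\n"
BAD = "int f(int a){if(a>0){if(a>1){if(a>2){if(a>3){return 1;}}}}return 0;}\n"

centroids = train_centroids([(GOOD, "well written"), (BAD, "badly written")])
unseen = "// constant\nint g() {\n    return 1;\n}\n"
label = classify(unseen, centroids)
```

The features are left unscaled here, so the longest-valued metric dominates the distance; a real setup would normalize the metrics and use far more rated samples than this toy pair.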
Similar resources
The use of machine learning with signal- and NLP processing of source code to fingerprint, detect, and classify vulnerabilities and weaknesses with MARFCAT
We present a machine learning approach to static code analysis and fingerprinting for weaknesses related to security, software engineering, and others using the open-source MARF framework and the MARFCAT application based on it for NIST’s SATE 2010 static analysis tool exposition workshop.
Comparative Analysis of Random Forests with Statistical and Machine Learning Methods in Predicting Fault-Prone Classes
Metrics are available for predicting fault-prone classes, which may help software organizations in planning and performing testing activities. This is possible through proper allocation of resources to the fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great...
A Systematic Model Building Process for Predicting
HECKMAN, SARAH SMITH. A Systematic Model Building Process for Predicting Actionable Static Analysis Alerts. (Under the direction of Laurie Williams). Automated static analysis tools can identify potential source code anomalies, like null pointers, buffer overflows, and unclosed streams that could lead to field failures. These anomalies, which we call alerts, require inspection by a developer to...
Information Visualization and Machine Learning Applied on Static Code Analysis
Software engineers will possibly never see perfect source code in their lifetime, but they are seeing much better analysis tools for finding defects in software. The approaches used in static code analysis have evolved from simple code crawling to the use of statistical and probabilistic frameworks. This work presents a new technique that incorporates machine learning and information visualization...
Automatic Forecasting of Design Anti-patterns in Software Source Code
The paper presents a framework for automatically inferring knowledge about the reasons anti-patterns appear in program source code during its development. Experiments carried out on the development histories of a few open-source Java projects showed that we can efficiently detect temporal patterns, which are indicators of the likely appearance of future anti-patterns. The approach presented i...