Kernel Based Asymmetric Learning for Software Defect Prediction
Abstract
Software defect prediction aims to identify defect-prone modules in the next release of a software system or in a different project (cross-project prediction). Real-world data mining applications, including software defect prediction, must address the problem of learning from imbalanced data sets. As pointed out by Khoshgoftaar et al. [1] and Menzies et al. [2], the majority of defects in a software system are located in a small percentage of the program modules, so software defect data sets are highly class imbalanced. Existing approaches to the class imbalance problem mainly comprise data sampling methods and adaptive algorithm methods. The results reported in [3] show that AdaBoost almost always outperforms even the best data sampling techniques in software defect prediction. AdaBoost is a typical adaptive algorithm and has received a good deal of attention since it was introduced by Freund and Schapire [4]. It attempts to reduce the bias induced by majority-class data by dynamically updating the instance weights according to the errors made in previous learning rounds. Besides these methods, some studies have adapted dimension reduction methods to the class imbalance problem [5], [6], [7]. Most recently, Qu et al. [7] proposed an asymmetric partial least squares classifier (APLSC) to tackle the class imbalance problem. They suggested that APLSC outperforms other existing algorithms because it can extract features that are favorable for imbalanced classification. However, APLSC is a bilinear classifier, in which the data are mapped to a bilinear subspace. In this paper, we develop a kernel-based asymmetric learning method, called Asymmetric Kernel Principal Component Classification (AKPCC), which is more adaptable to general situations.
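The instance reweighting that the abstract attributes to AdaBoost can be illustrated with a minimal sketch. It assumes the standard discrete AdaBoost update with labels in {-1, +1}; the function name `adaboost_round` and the toy data are hypothetical and are not taken from the paper.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One round of the discrete AdaBoost weight update.

    Assumes labels in {-1, +1} and weights that sum to 1.
    Returns the learner weight alpha and the renormalized instance weights.
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clip the error so the logarithm below stays finite.
    eps = 1e-12
    err = min(max(err, eps), 1 - eps)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified instances (t * p == -1) gain weight; the rest lose weight.
    new_w = [w * math.exp(-alpha * t * p)
             for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]

# Toy imbalanced data: one defective module (+1) among five modules.
y_true = [1, -1, -1, -1, -1]
# Hypothetical weak learner that labels every module as clean.
y_pred = [-1, -1, -1, -1, -1]
weights = [1 / len(y_true)] * len(y_true)

alpha, weights = adaboost_round(weights, y_true, y_pred)
print(alpha, weights)
```

In this toy case the single minority instance that was misclassified ends up carrying half of the total weight, so the next weak learner is pushed to get it right. This is the sense in which the reweighting counteracts the bias toward the majority class.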
Similar resources
On Software Defect Prediction Using Machine Learning
The goal of this paper is to catalog work on software defect prediction using machine learning. Over the last few years, the field of software defect prediction has been extensively studied because of its crucial position in the areas of software reliability maintenance, software cost estimation, and software quality assurance. An insurmountable problem associated with software defect prediction is th...
Kernel CCA Based Transfer Learning for Software Defect Prediction
A transfer learning method, called Kernel Canonical Correlation Analysis plus (KCCA+), is proposed for heterogeneous cross-company defect prediction. Combining the kernel method and transfer learning techniques, this method improves the performance of the predictor and adapts better to nonlinearly separable scenarios. Experiments validate its effectiveness. Key words: machine learning,...
Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search
In this paper, a method for predicting time series is presented. Time series prediction is a process that predicts future system values based on information obtained from past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...
A Comparative Analysis of General Defect Proneness in the Competing Software Systems
Predicting defect-prone software components is an economically important activity and so has received a good deal of attention. The main objective of this work on software defect-proneness is to propose and evaluate a general framework for defect prediction in software that supports 1) unbiased and 2) comprehensive comparisons between competing prediction systems. Generally, before building defect pre...
Software defect prediction using relational association rule mining
This paper focuses on the problem of defect prediction, a problem of major importance during software maintenance and evolution. It is essential for software developers to identify defective software modules in order to continuously improve the quality of a software system. As the conditions for a software module to have defects are hard to identify, machine learning based classification models...
Journal title:
- IEICE Transactions
Volume: 95-D
Publication date: 2012