Classification of Imbalanced Data by Combining the Complementary Neural Network and SMOTE Algorithm

Authors

  • Piyasak Jeatrakul
  • Kevin Kok Wai Wong
  • Lance Chun Che Fung
Abstract

In classification, when the training data are unevenly distributed among classes, the learning algorithm is generally dominated by the features of the majority classes, and the features of the minority classes are normally difficult to recognize fully. In this paper, a method is proposed to enhance the classification accuracy for the minority classes. The proposed method combines the Synthetic Minority Over-sampling Technique (SMOTE) and the Complementary Neural Network (CMTNN) to handle the problem of classifying imbalanced data. To demonstrate that the proposed technique can assist the classification of imbalanced data, several classification algorithms are used: Artificial Neural Network (ANN), k-Nearest Neighbor (k-NN) and Support Vector Machine (SVM). The benchmark data sets, with various ratios between the minority class and the majority class, are obtained from the University of California Irvine (UCI) machine learning repository. The results show that the proposed combination techniques can improve performance on the class imbalance problem.
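As a rough illustration of the over-sampling half of the method, the sketch below generates synthetic minority points by interpolating between a minority sample and one of its nearest minority neighbours, which is the core idea of SMOTE. It is a simplified variant written for this page, not the authors' implementation: the function name `smote`, its parameters, the random choice of base sample, and the toy data are all assumptions, and the CMTNN step is not shown because the abstract does not describe how it is applied.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Hypothetical helper: a simplified SMOTE-style sketch, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up for illustration): 4 minority and 12 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority = [[5.0 + 0.1 * i, 5.0 - 0.1 * i] for i in range(12)]

# Over-sample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing samples exactly.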

Similar resources

An experimental comparison of classification algorithm performances for highly imbalanced datasets

Imbalanced learning data often emerge during the process of knowledge discovery in data and present a significant challenge for data mining methods. In this paper we investigate the influence of class-imbalanced data on artificial intelligence methods, i.e. neural networks and support vector machines, and on classical classification methods represented by RIPPER and the Naïve Bayes classifier. ...

Preprocessing noisy imbalanced datasets using SMOTE enhanced with fuzzy rough prototype selection

The Synthetic Minority Over Sampling TEchnique (SMOTE) is a widely used technique to balance imbalanced data. In this paper we focus on improving SMOTE in the presence of class noise. Many improvements of SMOTE have been proposed, mostly cleaning or improving the data after applying SMOTE. Our approach differs from these approaches by the fact that it cleans the data before applying SMOTE, such...

Oversampling Method for Imbalanced Classification

The classification problem for imbalanced datasets is pervasive in many data mining domains, and imbalanced classification has been a hot topic in the academic community. From the data level to the algorithm level, many solutions have been proposed to tackle the problems resulting from imbalanced datasets. SMOTE is the most popular data-level method and a lot of derivations based on it are developed to ...

A combined SMOTE and PSO based RBF classifier for two-class imbalanced problems

This contribution proposes a powerful technique for two-class imbalanced classification problems by combining the synthetic minority over-sampling technique (SMOTE) and the particle swarm optimisation (PSO) aided radial basis function (RBF) classifier. In order to enhance the significance of the small and specific region belonging to the positive class in the decision region, the SMOTE is appli...

Cystoscopic Image Classification Based on Combining MLP and GA

In the past three decades, the use of smart methods in medical diagnostic systems has attracted the attention of many researchers. However, no smart activity has been provided in the field of medical image processing for diagnosis of bladder cancer through cystoscopy images despite the high prevalence in the world. In this paper, a multilayer neural network was applied to clas...

Publication date: 2010