An unsupervised self-organizing learning with support vector ranking for imbalanced datasets

Authors

  • Yok-Yen Nguwi
  • Siu-Yeung Cho
Abstract

The aim of a computational learning algorithm is to establish grounds that work for any type of data, once and for all. However, the majority of classifiers are based on balanced datasets. This paper discusses the issues related to the imbalanced data distribution problem and the common strategies for dealing with imbalanced datasets. We propose a model capable of handling imbalanced datasets well, where other typical classifiers fail. The model adopts a derivation of support vector machines to select variables so that the problem of imbalanced data distribution can be relaxed. We then use an Emergent Self-Organizing Map (ESOM) to cluster the ranked features so as to provide clusters for unsupervised classification. This work proceeds by examining the efficiency of the model in evaluating imbalanced datasets. Experimental results show that the criterion based on the weight vector derivative achieves good results and performs consistently well over imbalanced datasets. In general, our approach outperforms other classification methods that are unable to handle the imbalanced data distribution in the testing datasets. © 2010 Elsevier Ltd. All rights reserved.
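
The two-stage idea in the abstract (rank variables with an SVM-derived criterion, then cluster the selected variables with a self-organizing map) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy data generator, the cost-weighted linear SVM trained by subgradient descent, the use of the squared weight `w[j] ** 2` as a stand-in for the paper's weight-vector criterion, and a small 3x3 plain SOM in place of a full ESOM (which would normally use a much larger map). All function names are hypothetical.

```python
import math
import random


def make_imbalanced(n_major=90, n_minor=10, seed=0):
    """Toy data (assumed): feature 0 separates the classes, features 1-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n, centre in ((-1, n_major, 0.0), (1, n_minor, 4.0)):
        for _ in range(n):
            X.append([rng.gauss(centre, 0.5)] + [rng.gauss(0.0, 1.0) for _ in range(3)])
            y.append(label)
    return X, y


def train_linear_svm(X, y, pos_weight, epochs=200, lr=0.01, lam=0.01, seed=0):
    """Stochastic subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            cost = pos_weight if y[i] > 0 else 1.0  # heavier penalty on minority errors
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrinkage
            if margin < 1.0:  # hinge loss is active, so step towards the sample
                w = [wj + lr * cost * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y[i]
    return w, b


def rank_features(w):
    """Feature indices ordered by decreasing squared weight (assumed criterion)."""
    return sorted(range(len(w)), key=lambda j: w[j] ** 2, reverse=True)


def best_matching_unit(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - c) ** 2 for a, c in zip(codebook[k], x)))


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Plain rectangular SOM with a Gaussian neighbourhood that shrinks over time."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    codebook = [[rng.random() for _ in data[0]] for _ in grid]
    for t in range(epochs):
        decay = 1.0 - t / epochs
        lr, sigma = lr0 * decay, max(sigma0 * decay, 0.3)
        for x in rng.sample(data, len(data)):
            br, bc = grid[best_matching_unit(codebook, x)]
            for k, (r, c) in enumerate(grid):
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                codebook[k] = [wk + lr * h * (xk - wk) for wk, xk in zip(codebook[k], x)]
    return codebook


X, y = make_imbalanced()
w, b = train_linear_svm(X, y, pos_weight=y.count(-1) / y.count(1))
ranking = rank_features(w)
top = ranking[:2]  # keep the two highest-ranked features
reduced = [[row[j] for j in top] for row in X]
codebook = train_som(reduced)
units = [best_matching_unit(codebook, row) for row in reduced]
print("ranking:", ranking)
print("units used by minority samples:", sorted({u for u, lab in zip(units, y) if lab == 1}))
```

In this sketch the class labels are used only in the ranking stage; the map itself is trained without labels, and the labels reappear at the end only to inspect which map units the minority samples fall on.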

Similar resources

Facial Emotion Ranking Under Imbalanced Conditions

The aim of emotion recognition is to establish grounds that work for different types of emotions. However, the majority of classifiers are based on balanced datasets. Few works attempt to address how to approach facial emotion recognition under imbalanced conditions. This paper discusses the issues related to the imbalanced data distribution problem and the common strategy to...

Using Self-organizing Maps for Binary Classification with Highly Imbalanced Datasets

Highly imbalanced datasets occur in domains like fraud detection, fraud prediction, and clinical diagnosis of rare diseases, among others. These datasets are characterized by the existence of a prevalent class (e.g. legitimate sellers) while the other is relatively rare (e.g. fraudsters). Although small in proportion, the observations belonging to the minority class can be of a crucial importan...

Balance Support Vector Machines Locally Using the Structural Similarity Kernel

A structural similarity kernel is presented in this paper for SVM learning, especially for learning with imbalanced datasets. Kernels in SVM are usually pairwise, comparing the similarity of two examples only using their feature vectors. By building a neighborhood graph (kNN graph) using the training examples, we propose to utilize the similarity of linking structures of two nodes as an additio...

Signal Classifiers Using Self-organizing Maps: Performance and Robustness

This paper explores the use of self-organizing maps as a mechanism for performing unsupervised learning for signal classification. Approaches using unsupervised learning have a key advantage over traditional approaches that utilize neural networks and support vector machines because they do not require a training phase. We develop signal classifiers using self-organizing maps and explore their ...

Ischemia detection with a self-organizing map supplemented by supervised learning

The problem of maximizing the performance of the detection of ischemia episodes is a difficult pattern classification problem. The motivation for developing the supervising network self-organizing map (sNet-SOM) model is to exploit this fact for designing computationally effective solutions both for the particular ischemic detection problem and for other applications that share similar characte...

Journal:
  • Expert Syst. Appl.

Volume 37  Issue 

Pages  -

Publication date 2010