The class imbalance problem in pattern classification and learning
نویسندگان
چکیده
It has been observed that class imbalance (that is, significant differences in class prior probabilities) may produce an important deterioration of the performance achieved by existing learning and classification systems. This situation is often found in real-world data describing an infrequent but important case. In the present work, we perform a review of the most important research lines on this topic and point out several directions for further investigation.
منابع مشابه
A Novel One Sided Feature Selection Method for Imbalanced Text Classification
The imbalance data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. The classification algorithms have more tendencies to the large class and might even deal with the minority class data as the outlier data. The text data is one of t...
متن کاملBreast Cancer Diagnosis from Perspective of Class Imbalance
Introduction: Breast cancer is the second cause of mortality among women. Early detection is the only rescue to reduce the risk of breast cancer mortality. Traditional methods cannot effectively diagnose tumor since they are based on the assumption of well-balanced dataset.. However, a hybrid method can help to alleviate the two-class imbalance problem existing in the ...
متن کاملModified CLPSO-based fuzzy classification System: Color Image Segmentation
Fuzzy segmentation is an effective way of segmenting out objects in images containing both random noise and varying illumination. In this paper, a modified method based on the Comprehensive Learning Particle Swarm Optimization (CLPSO) is proposed for pixel classification in HSI color space by selecting a fuzzy classification system with minimum number of fuzzy rules and minimum number of incorr...
متن کاملLearning pattern classification tasks with imbalanced data sets
This chapter is concerned with the class imbalance problem, which has been recognised as a crucial problem in machine learning and data mining. The problem occurs when there are significantly fewer training instances of one class compared to another class.
متن کاملMMDT: Multi-Objective Memetic Rule Learning from Decision Tree
In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretation are two measures that conflict with each other. In this approach, we consider accuracy and interpretation of rules sets. Additionally, individual classifiers face other problems such as huge sizes, high dimensionality and imbalance classes’ distribution data sets. This...
متن کامل