Genetic Algorithm Based Over-Sampling with DNN in Classifying the Imbalanced Data Distribution Problem
Authors
Abstract
Objective: Data imbalance exists in many real-life applications. In imbalanced datasets, the minority class data leads to wrong inferences during classification, which increases misclassification. Much research has been done in the past to solve this issue, but as of now no globally working solution has been found for efficient classification. After analyzing the existing literature, it is proposed to minimize misclassification through genetic algorithm based oversampling and a deep neural network (DNN) classifier.
Method: In the proposed method, synthetic samples are generated based on a genetic algorithm. Initial populations for the genetic algorithm are generated using the Gaussian weight initialization technique. The fittest individual from the population is selected by Euclidean distance for further processing, to generate a double-size dataset that is then classified with the DNN.
Findings: The performance of the oversampled training data with the DNN classifier is compared with the C4.5 and Support Vector Machine (SVM) classifiers, and the DNN classifier outperforms the other two classifiers. SMOTE and ADASYN are considered for comparison, and the proposed approach outperforms these approaches. The experiments also show that misclassification is reduced and that the results are statistically significant and comparatively better.
Novelty: Synthetic sample generation with Gaussian initialization, sample selection with the Euclidean distance measure, and the reduction of misclassification are the novelty of this work.
Keywords: Genetic algorithm; Gauss initialization; SMOTE; ADASYN; Imbalanced data; Classification
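The abstract gives only the outline of the oversampling step, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the Gaussian initialization draws candidates around the minority-class centroid, that fitness is the Euclidean distance to the nearest real minority sample (lower is fitter), and that "double size" means doubling the minority class. The names `ga_oversample`, `fitness`, `pop_size`, `generations`, and `sigma` are hypothetical, as are the crossover and mutation operators.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def fitness(candidate, minority):
    """Assumed fitness: distance to the nearest real minority sample."""
    return min(euclidean(candidate, m) for m in minority)


def ga_oversample(minority, n_new, pop_size=40, generations=10, sigma=0.5, seed=0):
    """Return n_new synthetic minority samples (hypothetical GA sketch)."""
    if n_new > pop_size or pop_size < 4:
        raise ValueError("need 4 <= pop_size and n_new <= pop_size")
    rng = random.Random(seed)
    center = centroid(minority)
    # Gaussian initialization (assumed): candidates drawn around the centroid.
    population = [[rng.gauss(c, sigma) for c in center] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the half of the population closest to real samples.
        population.sort(key=lambda ind: fitness(ind, minority))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover followed by a small Gaussian mutation.
            child = [rng.choice(pair) + rng.gauss(0.0, 0.1 * sigma)
                     for pair in zip(p1, p2)]
            children.append(child)
        population = parents + children
    population.sort(key=lambda ind: fitness(ind, minority))
    return population[:n_new]


# Usage: double a toy minority class of three 2-D points.
minority = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9]]
synthetic = ga_oversample(minority, n_new=len(minority))
augmented = minority + synthetic
```

With this fitness the population drifts toward the existing minority points, so a practical variant would probably also penalize near-duplicates. The augmented minority class would then be combined with the majority class before training the classifier.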
Similar resources
Deep Over-sampling Framework for Classifying Imbalanced Data
Class imbalance is a challenging issue in practical classification problems for deep learning models as well as traditional models. Traditionally successful countermeasures such as synthetic oversampling have had limited success with complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling...
Borderline over-sampling for imbalanced data classification
Traditional classification algorithms, in many times, perform poorly on imbalanced data sets in which some classes are heavily outnumbered by the remaining classes. For this kind of data, minority class instances, which are usually much more of interest, are often misclassified. The paper proposes a method to deal with them by changing class distribution through oversampling at the borderline b...
Classifying Severely Imbalanced Data
Learning from data with severe class imbalance is difficult. Established solutions include: under-sampling, adjusting classification threshold, and using an ensemble. We examine the performance of combining these solutions to balance the sensitivity and specificity for binary classifications, and to reduce the MSE score for probability estimation.
The Imbalanced Training Sample Problem: Under or over Sampling?
The problem of imbalanced training sets in supervised pattern recognition methods is receiving growing attention. Imbalanced training sample means that one class is represented by a large number of examples while the other is represented by only a few. It has been observed that this situation, which arises in several practical domains, may produce an important deterioration of the classificatio...
The algorithm for solving the inverse numerical range problem
The numerical range of a square matrix A is denoted by W(A) and defined as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.
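To make the definition concrete, here is a small sketch of the forward map only: given a matrix A and a unit vector x, it evaluates z = x*Ax, a point of W(A). It does not solve the inverse problem that the thesis addresses. The function name `numerical_range_point` and the example matrix are illustrative assumptions.

```python
import math


def numerical_range_point(A, x):
    """Return z = x* A x for a square matrix A (list of rows) and a unit vector x."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    if not math.isclose(norm, 1.0, rel_tol=1e-9):
        raise ValueError("x must have unit norm")
    # Matrix-vector product A x.
    Ax = [sum(a * v for a, v in zip(row, x)) for row in A]
    # Inner product with the conjugate of x gives x* (A x).
    return sum(v.conjugate() * w for v, w in zip(x, Ax))


# Example: a diagonal matrix with eigenvalues 1 and i.
A = [[1, 0], [0, 1j]]
x = [1 / math.sqrt(2), 1 / math.sqrt(2)]
z = numerical_range_point(A, x)  # approximately 0.5 + 0.5j
```

For this diagonal example, z lies on the segment between the two eigenvalues, as expected for a normal matrix.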
Journal
Journal title: Indian Journal of Science and Technology
Year: 2023
ISSN: 0974-5645, 0974-6846
DOI: https://doi.org/10.17485/ijst/v16i8.863