Search results for: synthetic minority over sampling technique

Number of results: 1974657

Journal: Decision Analytics 2015
Syed Tanveer Jishan Raisul Islam Rashu Naheena Haque Rashedur M. Rahman

Demand for higher education has risen steadily all over the world in the last decade; therefore, the need to improve the education system is pressing. Educational data mining is an emerging area in the field of data mining, and it can be applied to better understand the educational systems in Bangladesh. In this research, we present how data can be preprocessed using a d...

Journal: Appl. Soft Comput. 2011
Francisco Fernández-Navarro César Hervás-Martínez Manuel Cruz-Ramírez Pedro Antonio Gutiérrez Antonio Valero

In this paper, q-Gaussian Radial Basis Functions are presented as an alternative to the Gaussian Radial Basis Function. This model is based on the q-Gaussian distribution, which parametrizes the Gaussian distribution by adding a new parameter q. The q-Gaussian Radial Basis Function allows different Radial Basis Functions to be represented by updating the new parameter q. For example, when the q-Gaussia...

2015
Zhipeng Xie Liyang Jiang Tengju Ye Xiaoli Li

Imbalanced class distribution is a challenging problem in many real-life classification tasks. Existing synthetic oversampling methods suffer from the curse of dimensionality because they rely heavily on Euclidean distance. This paper proposes a new method, called Minority Oversampling Technique based on Local Densities in Low-Dimensional Space (MOT2LD for short). MOT2LD first maps each trainin...

2009
Sandeep Chandana Henry Leung Kiril Trpkov

A novel technique for automatically selecting the best pairs of features and sampling techniques to predict the stage of prostate cancer is proposed in this study. The problem of class imbalance, which is prominent in most medical data sets, is also addressed here. Three feature subsets obtained by the use of principal components analysis (PCA), genetic algorithm (GA) and rough sets (RS) based ap...

2007
Jingrui He Jaime G. Carbonell

Rare category detection is an open challenge for active learning, especially in the de-novo case (no labeled examples), but it is of significant practical importance for data mining, e.g., detecting new financial transaction fraud patterns, where normal legitimate transactions dominate. This paper develops a new method for detecting an instance of each minority class via an unsupervised local-density-d...

Journal: Artif. Intell. Research 2017
Chun Gui

Class-imbalanced datasets are common in the mobile Internet industry. We tested three kinds of feature selection techniques: Random Forest (RF), Relative Weight (RW) and Standardized Regression Coefficients (SRC); three kinds of balance methods: over-sampling (OS), under-sampling (US) and synthetic minority over-sampling (SMOTE); and a widely used classification method, RF. The combined model...

2008
Jorge de la Calleja Olac Fuentes Jesús González

We introduce a method to deal with the problem of learning from imbalanced data sets, where examples of one class significantly outnumber examples of other classes. Our method selects minority examples from misclassified data given by an ensemble of classifiers. Then, these instances are over-sampled to create new synthetic examples using a variant of the well-known SMOTE algorithm. To build th...

2007
Cristiane Neri Nobre J. Miguel Ortega Antônio de Pádua Braga

An important task in the area of gene discovery is the correct prediction of the translation initiation site (TIS). The TIS can correspond to the first AUG, but this is not always the case. This task can be modeled as a classification problem between positive (TIS) and negative patterns. Here we have used a Support Vector Machine working with data processed by the class balancing method called Sm...

2014
Chumphol Bunkhumpornpat Krung Sinapiromsaran

In classification tasks, imbalanced data causes inadequate predictive performance on a tiny minority class because the decision boundary determined by trivial classifiers tends to be biased toward a huge majority class. For handling the class imbalance problem, over- and undersampling are applied at the data level. Over-sampling duplicates or synthesizes instances into a minority class. Althou...
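
For readers unfamiliar with how over-sampling "synthesizes" minority instances, the following is a minimal sketch of plain SMOTE-style interpolation. It is not the method of any paper listed here. The function name `smote_like_oversample`, its parameters, and the toy data are illustrative assumptions.

```python
import random
from math import dist


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration only; not taken from any listed paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the base itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy minority class with two features (assumed data).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=5)
```

Each synthetic point lies on a line segment between two existing minority samples, which is why approaches of this kind depend on the distance measure used to pick neighbours.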
