Uncertainty Based Under-Sampling for Learning Naive Bayes Classifiers Under Imbalanced Data Sets
Similar resources
Cluster-based under-sampling approaches for imbalanced data distributions
In classification problems, the training data significantly influence classification accuracy. However, data in real-world applications often have an imbalanced class distribution; that is, most of the data belong to the majority class and few belong to the minority class. In this case, if all the data are used as training data, the classifier tends to predict that most of the incomin...
Hierarchical Naive Bayes Classifiers for uncertain data
In experimental sciences, many classification problems deal with variables with replicated measurements. In this case the replicates are usually summarized by their mean or median. However, such a choice does not consider the information about the uncertainty associated with the measurements, thus potentially leading to over- or underestimating the probability associated with each classification. In th...
Budgeted Learning of Naive-Bayes Classifiers
There is almost always a cost associated with acquiring training data. We consider the situation where the learner, with a fixed budget, may 'purchase' data during training. In particular, we examine the case where observing the value of a feature of a training example has an associated cost, and the total cost of all feature values acquired during training must remain less than this ...
Learning Classifiers from Imbalanced, Only Positive and Unlabeled Data Sets
In this report, I present my results for the tasks of the 2008 UC San Diego Data Mining Contest. The contest consists of two classification tasks based on data from scientific experiment. The first is a binary classification task whose goal is to maximize classification accuracy on an evenly distributed test data set, given a fully labeled imbalanced training data set. The second task is also ...
CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification
Class imbalance classification is a challenging research problem in data mining and machine learning, as most real-life datasets are imbalanced in nature. Existing learning algorithms maximise the classification accuracy by correctly classifying the majority class, but misclassify the minority class. However, the minority class instances represent the concept with greater in...
Journal
Journal title: IEEE Access
Year: 2020
ISSN: 2169-3536
DOI: 10.1109/access.2019.2961784