Creating Ensembles of Classifiers
نویسندگان
چکیده
Ensembles of classifiers offer promise in increasing overall classification accuracy. The availability of extremely large datasets has opened avenues for application of distributed and/or parallel learning to efficiently learn models of them. In this paper, distributed learning is done by training classifiers on disjoint subsets of the data. We examine a random partitioning method to create disjoint subsets and propose a more intelligent way of partitioning into disjoint subsets using clustering. It was observed that the intelligent method of partitioning generally performs better than random partitioning for our datasets. In both methods a significant gain in accuracy may be obtained by applying bagging to each of the disjoint subsets, creating multiple diverse classifiers. The significance of our finding is that a partition strategy for even small/moderate sized datasets when combined with bagging can yield better performance than applying a single learner using the entire dataset.
منابع مشابه
A New Classifiers Ensemble Method for Handwritten Pen Digits Classification
Recent researches have shown that ensembles of classifiers have more accuracy than a single classifier. Baging, boosting and error correcting output codes (ECOC) are most common ways for creating combination of classifiers. In this paper a new method for ensemble of classifiers has been introduced and performance of this method examined by applying to handwritten pen digits dataset. Experimenta...
متن کاملHeterogeneous Ensemble Classification
The problem of multi-class classification is explored using heterogeneous ensemble classifiers. Heterogeneous ensembles classifiers are defined as ensembles, or sets, of classifier models created using more than one type of classification algorithm. For example, the outputs of decision tree classifiers could be combined with the outputs of support vector machines (SVM) to create a heterogeneous...
متن کاملParallel computation of kernel density estimates classifiers and their ensembles
Nonparametric supervised classifiers are interesting because they do not require distributional assumptions for the class conditional density, such as normality or equal covariance. However their use is not widespread because it takes a lot of time to compute them due to the intensive use of the available data. On the other hand bundling classifiers to produce a single one, known as an ensemble...
متن کاملFeature Selection for Ensembles of Simple Bayesian Classifiers
A popular method for creating an accurate classifier from a set of training data is to train several classifiers, and then to combine their predictions. The ensembles of simple Bayesian classifiers have traditionally not been a focus of research. However, the simple Bayesian classifier has much broader applicability than previously thought. Besides its high classification accuracy, it also has ...
متن کاملCreating diverse nearest-neighbour ensembles using simultaneous metaheuristic feature selection
The nearest-neighbour (1NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining problems. A vital consideration in obtaining good results with this technique is the choice of distance function, and correspondingly which features to consider when computing distances between samples. In recent years there has been an increasing interest in creating ens...
متن کامل