A Modular k-Nearest Neighbor Classification Method for Massively Parallel Text Categorization
Authors
Abstract
This paper presents a Min-Max modular k-nearest neighbor (M-k-NN) classification method for massively parallel text categorization. The basic idea behind the method is to decompose a large-scale text categorization problem into a number of smaller two-class subproblems and to combine all of the individual modular k-NN classifiers trained on those subproblems into a single M-k-NN classifier. Our experiments in text categorization demonstrate that M-k-NN is much faster than conventional k-NN, while its classification accuracy is slightly better. In practice, M-k-NN is closely related to the high-order k-NN algorithm, so its reliability is also supported to some extent on theoretical grounds.
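To make the decomposition concrete, the following is a minimal sketch of the min-max modular combination described above: the positive and negative examples of one two-class subproblem are split into partitions, a small k-NN module is trained on every (positive partition, negative partition) pair, and the module outputs are recombined with MIN units across negative partitions followed by a MAX unit across positive partitions. The partition counts, the use of scikit-learn's KNeighborsClassifier, and the use of predicted class probabilities as module outputs are illustrative assumptions rather than details taken from the paper.

# A minimal sketch of the min-max modular combination, under the assumptions
# stated above; not the paper's implementation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def train_modules(X_pos, X_neg, n_pos_parts=2, n_neg_parts=2, k=5):
    """Split one two-class subproblem and train a small k-NN module per partition pair."""
    pos_parts = np.array_split(X_pos, n_pos_parts)
    neg_parts = np.array_split(X_neg, n_neg_parts)
    modules = []
    for pos in pos_parts:
        row = []
        for neg in neg_parts:
            X = np.vstack([pos, neg])
            y = np.array([1] * len(pos) + [0] * len(neg))
            clf = KNeighborsClassifier(n_neighbors=min(k, len(X)))
            clf.fit(X, y)
            row.append(clf)
        modules.append(row)
    return modules

def min_max_score(modules, x):
    """Combine module outputs: MIN over negative partitions, MAX over positive ones."""
    x = np.asarray(x).reshape(1, -1)
    row_scores = []
    for row in modules:
        # probability assigned to the positive class by each module in this row
        probs = [clf.predict_proba(x)[0][list(clf.classes_).index(1)] for clf in row]
        row_scores.append(min(probs))  # MIN unit over negative partitions
    return max(row_scores)             # MAX unit over positive partitions

In a full categorization setting, one such module array would be trained for every two-class subproblem produced by the class decomposition, and a document would be assigned to the class with the largest combined score. Because every module sees only a small slice of the training data, the modules can be trained and queried independently, which fits the massively parallel setting the title refers to.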
Similar resources
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to all kinds of library resources. However, classifying documents within a large amount of data is still an issue and demands time and energy to find particular documents. Classifying similar documents into specific classes can reduce the time spent searching for the required data, particularly text documents. This is further facilitated by using Artificial...
Application of k-Nearest Neighbor on Feature Projections Classifier to Text Categorization
This paper presents the results of the application of an instance-based learning algorithm, k-Nearest Neighbor Method on Feature Projections (k-NNFP), to text categorization and compares it with the k-Nearest Neighbor Classifier (k-NN). k-NNFP is similar to k-NN except that it finds the nearest neighbors according to each feature separately. Then it combines these predictions using majority voting. This proper...
Application of k-Nearest Neighbor on Feature Projections Classifier to Text Categorization
This paper presents the results of the application of an instance-based learning algorithm, k-Nearest Neighbor Method on Feature Projections (k-NNFP), to text categorization and compares it with the k-Nearest Neighbor Classifier (k-NN). k-NNFP is similar to k-NN except that it finds the nearest neighbors according to each feature separately. Then it combines these predictions using majority voting. This pr... (a rough sketch of this per-feature voting idea appears after the list of similar resources)
Neighbor-weighted K-nearest neighbor for unbalanced text corpus
Text categorization or classification is the automated assignment of text documents to predefined classes based on their contents. Many classification algorithms assume that the training examples are evenly distributed among the different classes. However, unbalanced data sets often appear in many practical applications. In order to deal with uneven text sets, we propose the neighbor-weighted...
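As a companion to the two k-NNFP entries above, the following is a rough sketch of the per-feature voting idea they describe: for each feature, the k training examples nearest to the query on that single feature are found, and the resulting per-feature predictions are combined by majority voting. Treating "nearest on one feature" as the smallest absolute difference and letting each feature cast a single majority vote are assumptions of this sketch; the cited k-NNFP formulation may pool the neighbor votes differently.

# A rough per-feature voting sketch in the spirit of k-NNFP, under the
# assumptions stated above; names are illustrative, not from the cited papers.
import numpy as np
from collections import Counter

def knnfp_predict(X_train, y_train, x, k=5):
    """Predict a label by majority vote over per-feature k-NN predictions."""
    votes = []
    for f in range(X_train.shape[1]):
        # the k training examples closest to x on feature f alone
        nearest = np.argsort(np.abs(X_train[:, f] - x[f]))[:k]
        # that feature's prediction: the majority class among its neighbors
        votes.append(Counter(y_train[nearest]).most_common(1)[0][0])
    # final prediction: majority vote across all features
    return Counter(votes).most_common(1)[0][0]

Unlike conventional k-NN, which measures distance over all features at once, each feature here contributes its own prediction, which is the property the entries above contrast with the standard k-NN classifier.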