An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification
Authors
Abstract:
In recent years, the production of text documents has grown exponentially, which makes proper classification necessary for better access. One of the main problems in classifying text documents is working in a high-dimensional feature space. Feature Selection (FS) is one way to reduce the number of text attributes. Working with the full feature space without FS increases the computational cost, which is a function of the length of the feature vector; FS also helps to remove irrelevant attributes. The approach in this paper is a hybrid of the Flower Pollination Algorithm (FPA) and the Ada-Boost algorithm: FPA is used for FS and Ada-Boost is used for classification of text documents. Tests were conducted on the Reuters-21578, WEBKB and CADE 12 datasets. The results show that the hybrid model with FS achieves higher detection accuracy than the Ada-Boost algorithm alone. Comparisons also indicate higher detection accuracy for the proposed model than for KNN-K-Means, NB-K-Means and other learning models.
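The abstract does not give the update rules, so the following is only a minimal sketch of how FPA-based feature selection is commonly set up, not the authors' improved variant. It assumes the standard FPA moves (global pollination with a Lévy step toward the best flower, local pollination between two random flowers), a simple threshold to turn real-valued positions into keep/drop bits, and a hypothetical `toy_fitness` that stands in for classifier accuracy. In a setting like the one described above, the fitness could instead be the validation accuracy of an Ada-Boost classifier trained on the selected features; that wiring is an assumption and is not shown here. All function names and parameter defaults are illustrative.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / (abs(v) or 1e-12) ** (1 / beta)


def to_mask(position):
    """Threshold transfer (assumed): keep a feature when its coordinate is positive."""
    return [1 if x > 0 else 0 for x in position]


def fpa_select(fitness, n_features, n_flowers=20, n_iter=100, p_switch=0.8, seed=0):
    """Search for a feature mask that maximises `fitness` with a basic FPA loop."""
    rng = random.Random(seed)

    def clamp(x):
        # Keep positions bounded so large Levy jumps cannot run away.
        return max(-6.0, min(6.0, x))

    flowers = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_flowers)]
    scores = [fitness(to_mask(f)) for f in flowers]
    best_i = max(range(n_flowers), key=scores.__getitem__)
    best, best_score = flowers[best_i][:], scores[best_i]

    for _ in range(n_iter):
        for i in range(n_flowers):
            if rng.random() < p_switch:
                # Global pollination: Levy-scaled move toward the best flower.
                cand = [clamp(x + levy_step(rng) * (g - x))
                        for x, g in zip(flowers[i], best)]
            else:
                # Local pollination: move along the difference of two random flowers.
                j, k = rng.sample(range(n_flowers), 2)
                eps = rng.random()
                cand = [clamp(x + eps * (a - b))
                        for x, a, b in zip(flowers[i], flowers[j], flowers[k])]
            s = fitness(to_mask(cand))
            if s > scores[i]:  # greedy acceptance: keep only improvements
                flowers[i], scores[i] = cand, s
                if s > best_score:
                    best, best_score = cand[:], s
    return to_mask(best), best_score


def toy_fitness(mask, relevant=(0, 3, 5), penalty=0.1):
    """Hypothetical stand-in for accuracy: reward relevant features, penalise subset size."""
    return sum(mask[i] for i in relevant) - penalty * sum(mask)


mask, score = fpa_select(toy_fitness, n_features=10)
print(mask, score)
```

The size penalty in `toy_fitness` reflects the trade-off the abstract points to: a shorter feature vector lowers computational cost, so a subset is only worth growing when the added features improve the score.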
similar resources
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
An Improvement in Support Vector Machines Algorithm with Imperialism Competitive Algorithm for Text Documents Classification
Due to the exponential growth of electronic texts, their organization and management requires a tool to provide information and data in search of users in the shortest possible time. Thus, classification methods have become very important in recent years. In natural language processing and especially text processing, one of the most basic tasks is automatic text classification. Moreover, text ...
Category Discrimination Based Feature Selection Algorithm in Chinese Text Classification
How to improve the classification precision is a major issue in the field of Chinese text classification. The tf-idf algorithm is a classic and widely-used feature selection algorithm based on VSM. But the traditional tf-idf algorithm neglects the feature term’s distribution inside category and among categories, which causes many unreasonable selective results. This paper makes an improvement t...
An improved global feature selection scheme for text classification
Feature selection is known as a good solution to the high dimensionality of the feature space and mostly preferred feature selection methods for text classification are filter-based ones. In a common filter-based feature selection scheme, unique scores are assigned to features depending on their discriminative power and these features are sorted in descending order according to the scores. Then...
Journal title
volume 9 issue 1
pages 29-40
publication date 2018-02-01