Non-Probabilistic K-Nearest Neighbor for Automatic News Classification Model with K-Means Clustering

Author

  • AKANKSHA GUPTA
Abstract

News classification is a branch of text classification, or text mining. Researchers have already done extensive work on text classification models using different approaches. News items have to be classified into categories such as sports, politics, technology, business, science, health, regional news and many other similar categories. Researchers have already applied many supervised and unsupervised methods to news classification, and the supervised models have been found to be more efficient for this purpose. In the proposed model, the k-means algorithm is used to group the keywords into multiple groups, and the k-nearest neighbor (kNN) classification algorithm is used to estimate the category of the news item being processed. The proposed model achieved an average accuracy of 93.28%, obtained by averaging the accuracy over all test cases, which is higher than the previous best performer, a naïve Bayes and SVM based news classifier, which posted nearly 83.5% accuracy for classifying news data. The proposed model achieved 91%, 95%, 90% and 97% accuracy on the input test cases S1, S2, S3 and S4 respectively, which is higher than all of the existing models. Hence the proposed model can be regarded as a better solution than the previous classification models.

KEYWORDS: News classification, k-nearest neighbor, k-means classification, support vector machine, N-gram analysis.
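
The two-stage idea in the abstract (k-means to group keywords, then kNN to assign a category) can be illustrated with a minimal sketch. This is not the author's implementation: the toy corpus, the keyword representation (each keyword as a vector of its counts across training documents), the seeding of k-means with the first k keywords, and all function names (`kmeans`, `knn_predict`, `featurize`, `classify`) are assumptions made for illustration only. The kNN step here is a plain majority vote among the nearest neighbors, which is one reading of "non-probabilistic".

```python
from collections import Counter
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Seeding with the first k points is an
    assumption made here so that the sketch is deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its closest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def knn_predict(train_vecs, train_labels, query, k=3):
    """Majority vote among the k nearest training vectors."""
    order = sorted(range(len(train_vecs)),
                   key=lambda i: euclidean(train_vecs[i], query))
    return Counter(train_labels[i] for i in order[:k]).most_common(1)[0][0]

# Hypothetical toy corpus, not data from the paper.
train_docs = [
    ("team wins match goal", "sports"),
    ("player scores goal match", "sports"),
    ("team player match", "sports"),
    ("market stocks profit bank", "business"),
    ("bank profit market", "business"),
    ("stocks bank market profit", "business"),
]
texts = [t.split() for t, _ in train_docs]
labels = [lab for _, lab in train_docs]

# Stage 1: represent each keyword by its counts per training document,
# then group the keywords with k-means.
vocab = sorted({w for words in texts for w in words})
keyword_vecs = [[words.count(w) for words in texts] for w in vocab]
N_GROUPS = 2
keyword_cluster = dict(zip(vocab, kmeans(keyword_vecs, N_GROUPS)))

def featurize(text):
    """Count how many words of the text fall into each keyword group;
    words outside the vocabulary are ignored."""
    vec = [0] * N_GROUPS
    for w in text.split():
        if w in keyword_cluster:
            vec[keyword_cluster[w]] += 1
    return vec

# Stage 2: classify a news item with kNN over the group-count features.
train_vecs = [featurize(t) for t, _ in train_docs]

def classify(text, k=3):
    return knn_predict(train_vecs, labels, featurize(text), k)

print(classify("goal match team"))
print(classify("bank profit stocks"))
```

Grouping keywords first reduces each news item to a short vector of group counts, so the kNN distance is computed in a low-dimensional space. The number of groups and the value of k are free parameters in this sketch.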

Similar resources

FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and apply it to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a variety of library resources. However, classifying documents within a large amount of data is still an issue, and finding specific documents demands time and energy. Classifying similar documents into specific classes can reduce the time needed to search for the required data, particularly text documents. This is further facilitated by using Artificial...

Identification of selected monogeneans using image processing, artificial neural network and K-nearest neighbor

Over the last two decades, improvements in computational tools have made significant contributions to the classification of biological specimens' images into their corresponding species. These days, identification of biological species is much easier for taxonomists and even non-taxonomists due to the development of automated computer techniques and systems. In this study, we d...

Analytical Review of the News Data Classification Methods with Multivariate Classification Attributes

News classification has emerged as an important sub-branch of data mining. A lot of work has already been done on news classification with a variety of classifiers and feature descriptors. A number of news classification projects are running on real-time systems in existence today. News classification is an important part of online news portals. The online news port...

Publication date: 2016