Search results for: text documents classification

Number of results: 694633

2006
Ioannis Antonellis Christos Bouras Vassilis Poulopoulos Anastasios Zouzias

We explore scalability issues of the text classification problem where using (multi)labeled training documents we try to build classifiers that assign documents into classes permitting classification in multiple classes. A new class of classification problems, called ‘scalable’ is introduced that models many problems from the area of Web mining. The property of scalability is defined as the abi...

2015
Suhad A. Yousif Islam Elkabani Rached Zantout

A massive amount of documents are being posted online every minute. The task of document classification requires extensive background work on the content of documents, where keyword-based matching alone may not be sufficient. Much research has been carried out in several languages that has revealed significant results. However, Arabic documents still pose a great challenge due to the nature of ...

2006
Laila Khreisat

This paper presents the results of classifying Arabic text documents using the N-gram frequency statistics technique employing a dissimilarity measure called the “Manhattan distance”, and Dice’s measure of similarity. The Dice measure was used for comparison purposes. Results show that N-gram text classification using the Dice measure outperforms classification using the Manhattan measure.
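The comparison described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the character trigram size, the use of relative frequencies, the function names, and the toy English class texts are all assumptions made for illustration only. Each class is represented by an N-gram profile, and a document is assigned either to the class with the highest Dice similarity or to the one with the smallest Manhattan distance.

```python
from collections import Counter


def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams (n=3 is an assumed default)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def manhattan_distance(p, q):
    """Sum of absolute frequency differences over the union of n-grams."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


def dice_similarity(p, q):
    """Dice coefficient on the sets of n-grams present in each profile."""
    a, b = set(p), set(q)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def classify(doc, class_profiles, n=3, measure="dice"):
    """Pick the class whose profile is most similar (Dice) or nearest (Manhattan)."""
    profile = ngram_profile(doc, n)
    if measure == "dice":
        return max(class_profiles,
                   key=lambda c: dice_similarity(profile, class_profiles[c]))
    return min(class_profiles,
               key=lambda c: manhattan_distance(profile, class_profiles[c]))


# Hypothetical toy training texts, one per class.
profiles = {
    "sports": ngram_profile("football match goal team"),
    "finance": ngram_profile("bank stock market money"),
}
print(classify("the team scored a goal", profiles, measure="dice"))
print(classify("the team scored a goal", profiles, measure="manhattan"))
```

Note that Dice is a similarity (higher is better) while Manhattan is a dissimilarity (lower is better), which is why the sketch uses `max` for one and `min` for the other.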

Journal: Proceedings of the Institute for System Programming of the RAS 2020

2011
Seyyed Mohammad Reza Farshchi

The assignment of natural language texts to one or more predefined categories based on their content is an important component in many information organization and management tasks. This research proposes a novel approach to document classification that combines a competitive self-organizing neural text categorizer with new vectors that we call string vectors. Even ...

2002
Hyo-Jung Oh Moon-Soo Chang Myung-Gil Jang Sung Hyon Myaeng

With the exponential growth of information on the WWW, it is becoming increasingly difficult to find and organize relevant documents. Automatic text classification has been considered as a solution to the problem with its focus mostly on the subject or content of text [1]. Recently, researchers have realized that user information needs are not just based on the subject of a document but also on...

Journal: Applied Sciences 2022

With the proliferation of mobile devices, the number of social media users and online news articles is rapidly increasing, and text information is accumulating as big data. As spatio-temporal information becomes more important, research on extracting spatio-temporal information from text data and utilizing it for event analysis is being actively conducted. However, if spatio-temporal information that does not describe the core subject of a document is extracted, it is rather difficul...

2001
Aixin Sun Ee-Peng Lim

Hierarchical classification refers to the assignment of one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. ...

2003
Núria Bel Cornelis H. A. Koster Marta Villegas

This article deals with the problem of Cross-Lingual Text Categorization (CLTC), which arises when documents in different languages must be classified according to the same classification tree. We describe practical and cost-effective solutions for automatic Cross-Lingual Text Categorization, both in case a sufficient number of training examples is available for each new language and in the cas...

2012
Shweta C. Dharmadhikari Maya Ingle Parag Kulkarni

Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task difficult when assigning relevant classes to an input document. Traditional single label and multi class text cla...
