Search results for: text documents classification

Number of results: 694633  

Journal: :Expert Syst. Appl. 2006
Amy J. C. Trappey Fu-Chiang Hsu Charles V. Trappey Chia-I Lin

In order to process large numbers of explicit knowledge documents such as patents in an organized manner, automatic document categorization and search are required. In this paper, we develop a document classification and search methodology based on neural network technology that helps companies manage patent documents more effectively. The classification process begins by extracting key phrases...

2005
Peter Andras Olusola Idowu

Correct and efficient text classification is a major challenge given the rapidly increasing amount of accessible electronic text data. Kohonen networks have been applied to document classification with success comparable to that of other document clustering methods. An important challenge is to devise text similarity metrics that can improve the performance of text classification Kohonen netw...

Journal: :Interact. Techn. Smart Edu. 2008
Majed Sanan Mahmoud Rammal Khaldoun Zreik

Purpose – The classification of Arabic documents has recently become a real problem for juridical centers. In this case, some of the Lebanese official journal documents are already classified, and the center has to classify new documents based on them. This paper aims to study and explain the useful application of a supervised learning method to Arabic texts, using N-grams as an indexing method (n = 3). D...

2006
Youngsoo Kim Taekyong Nam Dongho Won

The openness of the Web allows any user to access almost any type of information. However, some information, such as adult content, is not appropriate for all users, notably children. Additionally, for adults, some content on abnormal pornographic sites can harm ordinary people’s mental health. In this paper, we propose an efficient 2-way text filter for blocking harmful web documents and a...

2009
Ludovic Denoyer

INTRODUCTION Document classification has developed over the last ten years, using techniques originating from the pattern recognition and machine learning communities. All these methods operate on flat text representations where word occurrences are considered independent. The recent paper (Sebastiani, 2002) gives a very good survey of textual document classification. With the development of st...

2006
Tao Peng Fengling He Wanli Zuo

Automatic text classification is one of the most important tools in Information Retrieval. As the traditional methods for text classification cannot find the best feature set, a genetic algorithm (GA) is applied to feature selection because it can find the globally optimal solution. This paper presents a novel text classifier that learns from positive and unlabeled documents based on the GA. Firstly, we identify reliable negat...

2009
R. Dinesh B. S. Harish D. S. Guru S. Manjunath

In this paper, we propose a new method of classifying text documents. Unlike conventional vector space models, the proposed method preserves the sequence of term occurrence in a document. The term sequence is effectively preserved with the help of a novel data structure called ‘Status Matrix’. Further, a corresponding classification technique is proposed for efficient classification of tex...

1990
David D. Lewis

The way in which text is represented has a strong impact on the performance of text classification (retrieval and categorization) systems. We discuss the operation of text classification systems, introduce a theoretical model of how text representation impacts their performance, and describe how the performance of text classification systems is evaluated. We then present the results of an exper...

1998
William W. Cohen

This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large quantities of unlabeled documents are readily available. We introduce an algorithm for learning from lab...

2006
Aaron M. Cohen

Automated document classification can be a valuable tool for biomedical tasks that involve large amounts of text. However, in biomedicine, documents that have the desired properties are often rare, and special methods are usually required to address this issue. We propose and evaluate a method of classifying biomedical text documents, optimizing for utility when misclassification costs are high...
