Search results for: text documents classification

Number of results: 694633

2014
Vishwanath Bijalwan Vinay Kumar Pinki Kumari Jordan Pascual

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a single-label classification task; otherwise, it is a multi-label classification task. TC uses several tools from Information Retrieval (IR) and Machine Learni...
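The single-label versus multi-label distinction described above can be illustrated with a toy sketch. The keyword sets, the `score` helper, and the threshold are illustrative assumptions and are not taken from the paper; a real system would use a trained classifier rather than keyword counts.

```python
# Toy illustration only: hypothetical keyword sets for a predefined category set.
CATEGORY_KEYWORDS = {
    "sports": {"match", "team", "goal"},
    "finance": {"market", "stock", "bank"},
    "health": {"doctor", "patient", "clinic"},
}

def score(text):
    """Count how many keywords of each category occur in the text."""
    tokens = set(text.lower().split())
    return {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}

def classify_single_label(text):
    """Single-label: return exactly one category (the highest-scoring one)."""
    scores = score(text)
    return max(scores, key=scores.get)

def classify_multi_label(text, threshold=1):
    """Multi-label: return every category whose score reaches the threshold."""
    return sorted(cat for cat, s in score(text).items() if s >= threshold)

doc = "the team scored a goal and the doctor watched the match"
single = classify_single_label(doc)  # one category
multi = classify_multi_label(doc)    # possibly several categories
```

Here the same document receives one category in the single-label setting and two in the multi-label setting.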

Journal: :CoRR 2014
Vishwanath Bijalwan Pinki Kumari Jordan Pascual Vijay Bhaskar Semwal

Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a single-label classification task; otherwise, it is a multi-label classification task. TC uses several tools from Information Retrieval (IR) and Machine Learni...

2007
Stephan Spat Bruno Cadonna Ivo Rakovac Christian Gütl Hubert Leitner Günther Stark Peter Beck

Background and Objective: At nearly every patient visit, medical documents are produced and stored in a medical record, often in unstructured form as free text. The growing number of stored documents increases the need for effective and timely retrieval of information. We developed a multi-label classification system to categorize German language free text medical documents (e.g. discharge letters, clinical fin...

1998
Kamal Nigam Andrew McCallum Sebastian Thrun Tom M. Mitchell

In many important text classification problems, acquiring class labels for training documents is costly, while gathering large quantities of unlabeled data is cheap. This paper shows that the accuracy of text classifiers trained with a small number of labeled documents can be improved by augmenting this small training set with a large pool of unlabeled documents. We present a theoretical argume...
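The idea of augmenting a small labeled set with unlabeled documents can be sketched as follows. The abstract is cut off before the method is described, so this sketch uses plain self-training with a small naive Bayes classifier as a simple stand-in; it should not be read as the authors' algorithm. All function names, the confidence threshold, and the example documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model (counts only; smoothing is applied at prediction)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Return (most probable label, its posterior probability)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        log_p = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing
                log_p += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        log_scores[label] = log_p
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(exp, key=exp.get)
    return best, exp[best] / sum(exp.values())

def self_train(labeled, unlabeled, confidence=0.6, rounds=3):
    """Repeatedly add confidently predicted unlabeled documents to the training set."""
    docs = [d for d, _ in labeled]
    labels = [l for _, l in labeled]
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train_nb(docs, labels)
        remaining = []
        for text in pool:
            label, prob = predict_nb(model, text)
            if prob >= confidence:
                docs.append(text)
                labels.append(label)
            else:
                remaining.append(text)
        if len(remaining) == len(pool):  # nothing new was added; stop early
            break
        pool = remaining
    return train_nb(docs, labels)

labeled = [("meeting agenda today", "ham"), ("cheap pills offer", "spam")]
unlabeled = ["cheap offer viagra", "agenda for project meeting", "viagra discount"]
model = self_train(labeled, unlabeled)
prediction, _ = predict_nb(model, "viagra discount")
```

With only the two labeled documents, the word "viagra" is unknown and the classifier cannot separate the classes for the last document; after absorbing the unlabeled pool, it can.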

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...

Journal: :Comput. Syst. Sci. Eng. 2009
Ioannis Antonellis Christos Bouras Vassilis Poulopoulos

We consider scalability issues of the text classification problem in which, using (multi-)labeled training documents, we try to build classifiers that assign documents to classes, permitting classification into multiple classes. A new class of classification problems, called ‘scalable’, is introduced, with applications in web mining. Scalable classification utilizes newly classified instances in ...

2014
Yangqiu Song Dan Roth

In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought-after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can b...
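A minimal sketch of classification without labeled training data might look like the following. The abstract is truncated before it says how labels are represented, so this sketch assumes a crude bag-of-words cosine similarity between a document and a short description of each label, applied top-down over a two-level hierarchy. The hierarchy, the label descriptions, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical two-level label hierarchy; each label is described only by a few words.
HIERARCHY = {
    "sports": {
        "description": "sports game team player",
        "children": {
            "soccer": "soccer goal football pitch",
            "tennis": "tennis racket serve court",
        },
    },
    "science": {
        "description": "science research experiment theory",
        "children": {
            "physics": "physics energy particle quantum",
            "biology": "biology cell gene species",
        },
    },
}

def bow(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify_dataless(text):
    """Pick the most similar top-level label, then the most similar child label."""
    doc = bow(text)
    top = max(HIERARCHY, key=lambda l: cosine(doc, bow(HIERARCHY[l]["description"])))
    children = HIERARCHY[top]["children"]
    leaf = max(children, key=lambda l: cosine(doc, bow(children[l])))
    return top, leaf

path = classify_dataless("the player hit a strong serve and the team won the tennis game")
```

No labeled documents are used at any point; only the label descriptions drive the decision. Plain word overlap is a weak proxy for "understanding" a label, so a richer semantic representation would be needed in practice.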

2006
Lucian N. Vintan

1. Introduction. Most real-world data collections are in text format. Such data are considered semi-structured because they have only a small amount of organized structure. Modeling and implementation on semi-structured data in recent databases have grown continually in recent years. Moreover, information retrieval applications, such as indexing methods for text documents, have been adapted in order to work...

2012
I. Morariu Lucian N. Vintan Volker Tresp

Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigated three approaches to building a meta-classifier in order to increase the classification accuracy. The basic idea is to learn a meta-classifier to optimally select the best component classifier for each data point. The experimental results show that combining classifiers c...
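The basic idea of selecting one component classifier per data point can be sketched as below. The three approaches studied in the paper are not visible in the truncated abstract, so this is only one plausible selection rule, chosen for illustration: look up the nearest validation example and use a component that classified it correctly. The 1-D inputs, the two component classifiers, and the validation set are hypothetical; for text, `x` would be a document feature vector and the distance a vector distance.

```python
# Hypothetical component classifiers over a 1-D feature, each reliable in a different region.
def clf_low(x):
    """Correct for x < 8 under the toy labeling; never predicts 'C'."""
    return "A" if x < 2 else "B"

def clf_high(x):
    """Correct for x >= 2 under the toy labeling; never predicts 'A'."""
    return "B" if x < 8 else "C"

COMPONENTS = [clf_low, clf_high]

# Hypothetical validation set of (feature, true label) pairs.
VALIDATION = [(0, "A"), (1, "A"), (5, "B"), (9, "C"), (10, "C")]

def select_component(x):
    """Choose a component that was correct on the nearest validation example."""
    nearest_x, nearest_y = min(VALIDATION, key=lambda pair: abs(pair[0] - x))
    for clf in COMPONENTS:
        if clf(nearest_x) == nearest_y:
            return clf
    return COMPONENTS[0]  # fallback when no component was correct

def meta_classify(x):
    """Delegate the prediction to the component selected for this data point."""
    return select_component(x)(x)

low_result = meta_classify(0.5)
high_result = meta_classify(9.5)
```

Neither component is correct everywhere in this toy setup, but the selector routes each point to the component that is reliable in its neighborhood.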
