text documents classification

Search results for: text documents classification

Number of results: 694633

A Study on Analysis of SMS Classification Using Document Frequency Thresold

2012

In recent years, feature selection has been a chief concern in text classification. A major characteristic of text classification is the high dimensionality of the feature space. Therefore, feature selection is considered one of the crucial parts of text document categorization. Selecting the best features to represent documents can reduce the dimensionality of the feature space and hence increase the ...

Enhanced Information Retrieval from Narrative German-language Clinical Text Documents using Automated Document Classification

Journal: Studies in Health Technology and Informatics 2008

Stephan Spat, Bruno Cadonna, Ivo Rakovac, Christian Gütl, Hubert Leitner, Günther Stark, Peter Beck

The amount of narrative clinical text documents stored in Electronic Patient Records (EPR) of Hospital Information Systems is increasing. Physicians spend a lot of time finding relevant patient-related information for medical decision making in these clinical text documents. Thus, efficient and topical retrieval of relevant patient-related information is an important task in an EPR system. This...

Text Mining Business Policy Documents

Journal: International Journal of Business Intelligence Research 2020

Recognizing Documents versus Meta-Documents by Tree Kernel Learning

2015

Boris A. Galitsky, Nina Lebedeva

The problem of classifying text with respect to metalanguage and language object patterns is formulated and its application areas are proposed. Examples of metalanguage patterns in text are foreign language grammar lessons and tutorials on how to write engineering documents. The method targets the text classification tasks where keyword statistics are insufficient to distinguish between such abs...

Text Categorization of Commercial Web Pages

2007

E. Binaghi

In this paper we describe a new on-line document categorization strategy that can be integrated within Web applications. A salient aspect is the use of neural learning in both representation and classification tasks. Within text documents conceived as images, the regions of interest (RoI) containing information meaningful for categorization are identified with the support of a supervised neural...

Dataless Text Classification with Descriptive LDA

2015

Xingyuan Chen, Yunqing Xia, Peng Jin, John A. Carroll

Manually labeling documents for training a text classifier is expensive and time-consuming. Moreover, a classifier trained on labeled documents may suffer from overfitting and adaptability problems. Dataless text classification (DLTC) has been proposed as a solution to these problems, since it does not require labeled documents. Previous research in DLTC has used explicit semantic analysis of W...

Prototype of a Medical Information Retrieval System for Electronic Patient Records: Finding relevant information in clinical text documents

2007

Stephan Spat

The Steiermärkische Krankenanstalten Ges.m.b.H. (KAGes) conducted the roll-out of an electronic patient record (EPR) system in 2004. This system contains an increasing amount of unstructured clinical text documents in the German language. In order to facilitate patient-related medical decision-making for physicians, this diploma thesis analyses and implements methods for retrieving relevant medical...

Web Content Categorization Using Link Information

2006

Zoltán Gyöngyi, Hector Garcia-Molina, Jan Pedersen

Document categorization is one of the foundational problems in (web) information retrieval. Even though web documents are hyperlinked, most proposed classification techniques take little advantage of the link structure and rely primarily on text features, as it is not immediately clear how to make link information intelligible to supervised machine learning algorithms. This paper introduces a l...

Text Classification: Forming Candidate Key-Phrases from Existing Shorter Ones

2007

Nikitas N. Karanikolas, Christos Skourlas

The hard problem of text classification usually has various aspects and potential solutions. In this paper, two main research directions for the classification of narrative documents are considered. The first is based on data mining and rule induction techniques, while the second combines traditional text retrieval techniques (use of the vector space model,

Text Identification in Noisy Document Images Using Markov Random Field

2003

Yefeng Zheng, Huiping Li, David S. Doermann

In this paper we address the problem of the identification of text from noisy documents. We segment and identify handwriting from machine printed text because 1) handwriting in a document often indicates corrections, additions or other supplemental information that should be treated differently from the main or body content, and 2) the segmentation and recognition techniques for machine printed...
