Search results for: text documents

Number of results: 222232

Journal: International Journal of Advanced Computer Science and Applications 2019

Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation, thereby improving the effectiveness of the results retrieved. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts could be divided into different types, wit...

Abstract: Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into predefined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to it. The first step in text classification is to represent documents in a suitable way t...

1999
Hakan Ancin Anoop K. Bhattacharjya

The vast majority of copied documents generally consist of text, and the copy quality mostly depends on the text’s reproduction quality. A new technique to enhance dark text on light background of scanned mixed mode documents (containing text, graphics and photo) is presented to improve copy quality. This technique incorporates various image processing filters that enhance dark text without dis...

Journal: CoRR 2004
S. M. Kamruzzaman Farhana Haider

Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms for automatic text classification need sufficient documents to learn accurately. This paper presents a new algorithm for text classification that requires fewer documents for training. Instead of using words, word relations, i.e., association rules from...

In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...

2003
Atreya Basu Carolyn R. Watters Michael A. Shepherd

Text categorization is the process of sorting text documents into one or more predefined categories or classes of similar documents. Differences in the results of such categorization arise from the feature set chosen to base the association of a given document with a given category. Advocates of text categorization recognize that the sorting of text documents into categories of like documents r...

Journal: CoRR 2014
Sandeep Sirsat

The information available on web pages mostly contains semi-structured text documents which are represented either in XML, or HTML, or XHTML format that lacks formatted document structure. The document does not discriminate between the text and the schema that represent the text. Also the amount of structure used to represent the text depends on the purpose and size of text document. No semanti...

2010
B S Harish S Manjunath

Text classification is one of the important research issues in the field of text mining, where documents are classified with supervised knowledge. In the literature we can find many text representation schemes and classifiers/learning algorithms used to classify text documents into predefined categories. In this paper, we present various text representation schemes and compare different class...

2012
Vishal Gupta

Punjabi text classification is the process of assigning predefined classes to unlabelled text documents. Because of the dramatic increase in the amount of content available in digital form, text classification has become an urgent need for managing digital data efficiently and accurately. Until now, no Punjabi text classifier has been available for Punjabi text documents. Therefore, in this paper, existi...
