Search results for: text document classification

Number of results: 765658

2011
Dat Huynh, Dat Tran, Wanli Ma, Dharmendra Sharma

Term frequency and document frequency are currently used to measure term significance in text classification. However, these measures cannot provide sufficient information to differentiate important terms. Thus, in this research, a new term ranking (weighting) approach for text classification is proposed. The approach is first based on relations among terms to estimate the important lev...

2015
Mans Hulden, Miikka Silfverberg, Jerid Francom

Text-based geolocation classifiers often operate with a grid-based view of the world. Predicting document location of origin based on text content on a geodesic grid is computationally attractive since many standard methods for supervised document classification carry over unchanged to geolocation in the form of predicting a most probable grid cell for a document. However, the grid-based approa...

Journal: CoRR 2012
Muhammad Rafi, Sundus Hassan, Mohammad Shahid Shaikh

The process of text categorization assigns labels or categories to each text document according to the semantic content of the document. Traditional approaches to text categorization used features from the text, such as words, phrases, and concept hierarchies, to represent the documents and reduce their dimensionality. Recently, researchers addressed this brittleness by incorporating backgrou...

2012
P. Perumal

Document clustering is useful in many information retrieval operations such as document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of documents, etc. The general goal of clustering is to group data elements such that the intra-group similarities are high and the inter-group similarities are low. Generative models based on the multivariate Berno...

2015
Jincy B. Chrystal

Most text classification problems are associated with multiple class labels, and hence automatic text classification is one of the most challenging and prominent research areas. Text classification is the problem of categorizing text documents into different classes. In the multi-label classification scenario, each document may have more than one label. The real challenge in ...

2003
Manabu Sassano

We explore how virtual examples (artificially created examples) improve the performance of text classification with Support Vector Machines (SVMs). We propose techniques to create virtual examples for text classification based on the assumption that the category of a document is unchanged even if a small number of words are added or deleted. We evaluate the proposed methods by Reuters-21578 test se...

2014
Eman Al-Thwaib

Text classification (TC), or text categorization, is the task of assigning a document to one or more predefined classes or categories. A common problem in TC is the high number of terms or features in the document(s) to be classified (the curse of dimensionality). This problem can be solved by selecting the most important terms. In this study, automatic text summarization is used for feature selection. S...

2014
Mickaël Poussevin, Élie Guàrdia-Sebaoun, Vincent Guigue, Patrick Gallinari

Sentiment classification and recommender systems were until recently completely disjoint domains. Recommender systems exploit the users/items/rates matrix while omitting the available text information. Sentiment classification exploits text reviews and consumer rates to build models for document analysis. In this article we propose a unified model exploiting both text and user, items and rates...

2010
Valeriana G. Roncero, Myrian C. A. Costa, Nelson F. F. Ebecken

The enormous amount of information stored in unstructured texts cannot simply be used for further processing by computers, which typically handle text as simple sequences of character strings. Text mining is the process of extracting interesting information and knowledge from unstructured text. One key difficulty with text classification learning algorithms is that they require many hand-labele...

2004
Stephan Bloehdorn, Andreas Hotho

Current text classification systems typically use term stems for representing document content. Ontologies allow the usage of features on a higher semantic level than single words for text classification purposes. In this paper we propose such an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting, a successful machine learning tec...
