Search results for: text documents classification

Number of results: 694633

2016
Shivani Kundra

For the last few years, text mining has been gaining significant importance, since knowledge is now available to users through a variety of sources, i.e., electronic media, digital media, print media, and many more. Due to the huge availability of text in numerous forms, a lot of unstructured data has been recorded by research experts, who have found numerous ways in the literature to convert this scattered...

2017
Long Ma

Text classification, the task of assigning metadata to documents, requires significant time and effort when performed by humans. Moreover, with online-generated content growing explosively, manually annotating large-scale, unstructured data becomes a challenge. Currently, many state-of-the-art text mining methods have been applied to the classification process, many of them based on the key wor...

Journal: J. Artif. Intell. Res. 2016
Alejandro Moreo Andrea Esuli Fabrizio Sebastiani

Multilingual Text Classification (MLTC) is a text classification task in which documents are written each in one among a set L of natural languages, and in which all documents must be classified under the same classification scheme, irrespective of language. There are two main variants of MLTC, namely Cross-Lingual Text Classification (CLTC) and Polylingual Text Classification (PLTC). In PLTC, ...

2007
Tomas Berg Christian Mårtenson Pontus Svenson

In this paper, we discuss how text mining methods could be used for intelligence analysis. We describe how simple methods from text mining can be used to help intelligence analysts determine where a specific report or analysis fits into the knowledge base (KB), i.e., how it should be classified and which, if any, other documents in the KB it should be linked to. The method works by comparing th...

2008
Tomas Berg Christian Mårtenson Pontus Svenson

In this paper, we discuss how text mining methods could be used in a mixed-initiative interaction approach to intelligence analysis. We describe how simple methods from text mining can be used to help intelligence analysts determine where a specific report or analysis fits into the knowledge base (KB), i.e., how it should be classified and which, if any, other documents in the KB it should be li...

Journal: J. UCS 2013
Christin Seifert Eva Ulbrich Roman Kern Michael Granitzer

In text classification, the amount and quality of training data are crucial for the performance of the classifier. The generation of training data is done by human labellers, a tedious and time-consuming task. To reduce the labelling time for single documents, we propose to use condensed representations of text documents instead of the full-text document. These condensed representations are key sen...

2003
Raghu Krishnapuram Krishna Prasad Chitrapura Sachindra Joshi

In this paper, we describe a new approach to classification of text documents based on the minimization of system entropy, i.e., the overall uncertainty associated with the joint distribution of words and labels in the collection. The classification algorithm assigns a class label to a new document in such a way that its insertion into the system results in the maximum decrease (or least increa...

Journal: Expert Syst. Appl. 2014
Kwanho Kim Beom-Suk Chung Ye Rim Choi Seungyun Lee Jae-Yoon Jung Jonghun Park

Short-text classification is increasingly used in a wide range of applications. However, it still remains a challenging problem due to the insufficient nature of word occurrences in short-text documents, although some recently developed methods which exploit syntactic or semantic information have enhanced performance in short-text classification. The language-dependency problem, however, caused...

2002
Venu Dasigi Reinhold C. Mann

In intelligent analysis of large amounts of text, no single clue reliably indicates that a pattern of interest has been found. When using multiple clues, it is not known how these should be integrated into a decision. In the context of this investigation, we have been using neural nets as parameterized mappings that allow for fusion of higher level clues extracted from free text. By using ...

2006
Angelo Dalli Yorick Wilks

The frequency of occurrence of words in natural languages exhibits a periodic and a non-periodic component when analysed as a time series. This work presents an unsupervised method of extracting periodicity information from text, enabling time series creation and filtering to be used in the creation of sophisticated language models that can discern between repetitive trends and non-repetitive w...
