text clustering

Search results for: text clustering

Number of results: 264479

Clustering Massive Text Data Streams by Semantic Smoothing Model

2007

Yubao Liu Jiarong Cai Jian Yin Ada Wai-Chee Fu

Clustering text data streams is an important issue in the data mining community and has a number of applications, such as news group filtering, text crawling, document organization, and topic detection and tracing. However, most methods are similarity-based approaches that use the TF*IDF scheme to represent the semantics of text data, which often leads to poor clustering quality. In this paper, we fir...
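As a rough illustration of the similarity-based TF*IDF representation this abstract refers to, the following minimal sketch builds TF-IDF vectors and compares documents by cosine similarity. It is not the authors' method: the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, the smoothed IDF variant, and the sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {}
        for term, count in tf.items():
            # Assumption: smoothed IDF, so every weight stays positive.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            vec[term] = (count / len(tokens)) * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, invented for illustration only.
docs = [
    "stock markets fall on rate fears",
    "markets fall as rate fears grow",
    "team wins the cup final",
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # two documents sharing several terms
sim_02 = cosine(vecs[0], vecs[2])  # two documents sharing no terms
```

Because this representation only matches exact terms, two documents on the same topic that use different vocabulary would score zero, which is one commonly cited weakness of purely term-based similarity.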


Optimization and Application of OPTICS Algorithm on Text Clustering

2013

Bo Shen Ying-Si Zhao

Text clustering is of great importance in data mining, information fusion, artificial intelligence, and other fields. Many methods in the literature can be used to classify text. Most of them require parameters, such as the number of categories, which must be assigned in advance or estimated during the classification process. However, it is difficult to determine these quantities in...


An Ant Colony-based Text Clustering System with Cognitive Situation Dimensions

Journal: Int. J. Computational Intelligence Systems, 2015

Yi Guo Yan Li Zhiqing Shao

To build human cognition features into the clustering procedure, this paper introduces a novel text clustering system, CogTCA (Cognitive Text Clustering with Ants), which (1) represents texts according to four cognitive situation dimensions in the form of cognitive situation matrices and vectors rather than canonical high-dimensional sparse matrices, (2) proposes several new similarit...


Evaluating Text Clustering Methods for Text Classification

2007

Mehrbod Sharifi

In this project report, I evaluate several text clustering approaches and how they can be used for text classification. The particular tasks are topic classification on the 20 Newsgroup dataset and sentiment classification on a restaurant reviews dataset. Future directions for improving the results are also discussed.


Less-redundant Text Summarization using Ensemble Clustering Algorithm based on GA and PSO

2017

JUNG SONG LEE HAN HEE HAHM SOON CHEOL PARK

In this paper, a novel text clustering technique is proposed to summarize text documents. The clustering method, the so-called ‘Ensemble Clustering Method’, combines genetic algorithms (GA) and particle swarm optimization (PSO) efficiently and automatically to get the best clustering results. The summarization with this clustering method is to effectively avoid the redundancy in the summarized...


Noisy Text Clustering

2004

David Grangier Alessandro Vinciarelli

This work presents document clustering experiments performed over noisy texts (i.e., texts that have been extracted through an automatic process such as speech or character recognition). The effect of recognition errors on different clustering techniques is measured by comparing the results obtained with clean (manually typed) texts and noisy ones (automatic speech transcripts affected by 30...


Introduction to Text Clustering

2008

Magnus Rosell


Ontology-based Text Clustering

2001

A. Hotho S. Staab A. Maedche

Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular clustering result it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledg...


Text Classification Using Clustering

2006

Antonia Kyriakopoulou Theodore Kalamboukis

This paper addresses the problem of learning to classify texts by exploiting information derived from both training and testing sets. To accomplish this, clustering is used as a complementary step to text classification, and is applied not only to the training set but also to the testing set. This approach allows us to estimate the location of the testing examples and the structure of the whole...


Performance Evaluation of an Efficient Frequent Item sets-Based Text Clustering Approach

2010

S. Murali Krishna

The vast amount of textual information available in electronic form has been growing at a staggering rate in recent times. Mining useful or interesting frequent itemsets (words/terms) from the very large text databases formed as a result of the increasing amount of textual data still seems to be quite a challenging task. A great deal of attention in the research community has been receiv...

