Search results for: text clustering

Number of results: 264479

2007
Xavier Sevillano Germán Cobo Francesc Alías Joan Claudi Socoró

A major problem encountered by text clustering practitioners is the difficulty of determining a priori which is the optimal text representation and clustering technique for a given clustering problem. As a step towards building robust document partitioning systems, we present a strategy based on a hierarchical consensus clustering architecture that operates on a wide diversity of document repre...

2014
Sunita Sarkar Arindam Roy B. S. Purkayastha

The volume of digitized text documents on the web has been increasing rapidly. As there is a huge collection of data on the web, there is a need for grouping (clustering) the documents into clusters for speedy information retrieval. Clustering of documents is the collection of documents into groups such that the documents within each group are similar to each other and not to documents of other groups...

2014
Dilpreet Kaur Shruti Aggarwal

The explosive growth of information stored in unstructured texts has created a great demand for new and powerful tools to acquire useful information, such as text mining. Document clustering is one of its powerful methods, by which document retrieval, organization and summarization can be achieved. Text documents are the unstructured databases that contain raw data collection. The clustering...

2014
Svetlana Popova Ivan Khodyrev Irina Ponomareva Tatiana Krivosheeva

This paper deals with the clustering task for Russian texts obtained using automatic speech recognition (ASR). The inputs for processing are recognition results for phone call recordings and manual text transcripts of these calls. We present a comparative analysis of clustering results for recognition texts and manual text transcripts, make an evaluation of how recognition quality affe...

2012
Sandeep Kumar Mathariya Vishakha Soni Anand Sen Ranu Soni

The search for interesting information in a huge data collection is a tough job that frustrates those seeking that information. Automatic text summarization has emerged to facilitate this search process. Automatic text summarization compresses an original document into an abridged version by extracting almost all of the essential concepts with text mining techniques. The selection of dist...

2009
Manish Sinha

Overview: Amongst various analyses performed on patents, the area where specialized software helps immensely is text‐mining, and two of the most popular text mining techniques used over patent data are (1) text segmentation / tokenization and (2) text clustering / topic identification. Text segmentation is a process of analyzing the patent text and identifying smaller meaningful segments from the text. T...

2010
Reynaldo Gil-García Aurora Pons-Porrata

Feature selection has improved the performance of text clustering. In this paper, a local feature selection technique is incorporated in the dynamic hierarchical compact clustering algorithm to speed up the computation of similarities. We also present a quality measure to evaluate hierarchical clustering that considers the cost of finding the optimal cluster from the root. The experimental resu...

2013
Noha Negm Mohamed Amin Passent Elkafrawy Abdel Badeeh M. Salem

Document Clustering is one of the main themes in text mining. It refers to the process of grouping documents with similar contents or topics into clusters to improve both availability and reliability of text mining applications. Some of the recent algorithms address the problem of high dimensionality of the text by using frequent termsets for clustering. Although the drawbacks of the Apriori al...

2012
Antonia Kyriakopoulou

Supervised and unsupervised learning have been the focus of critical research in the areas of machine learning and artificial intelligence. In the literature, these two streams flow independently of each other, despite their close conceptual and practical connections. In this work we deal exclusively with the scenario of text classification aided by clustering. This chapter provides a review and i...

2003
Shi Zhong Joydeep Ghosh

Generative models based on the multivariate Bernoulli and multinomial distributions have been widely used for text classification. Recently, the spherical k-means algorithm, which has desirable properties for text clustering, has been shown to be a special case of a generative model based on a mixture of von Mises-Fisher (vMF) distributions. This paper compares these three probabilistic models ...
