Concept Tree Based Clustering Visualization with Shaded Similarity Matrices
Abstract
One of the problems with existing clustering methods is that the interpretation of clusters may be difficult. Two different approaches have been used to solve this problem: conceptual clustering in machine learning and clustering visualization in statistics and graphics. The purpose of this paper is to investigate the benefits of combining clustering visualization and conceptual clustering to obtain better cluster interpretations. In our research we have combined concept trees for conceptual clustering with shaded similarity matrices for visualization. Experimentation shows that the two interpretation approaches can complement each other to help us understand data better.
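As a rough illustration of the visualization side only, the sketch below builds a small shaded similarity matrix in text form. It is not the authors' implementation: the similarity measure (one minus scaled Euclidean distance), the character ramp, the toy data, and the function names `similarity_matrix`, `reorder_by_cluster`, and `shade` are all assumptions chosen for the example. The idea it shows is the general one behind such displays: once rows and columns are ordered so that members of the same cluster are adjacent, clusters appear as dark blocks along the diagonal.

```python
import numpy as np

def similarity_matrix(points):
    """Pairwise similarity in [0, 1]: 1 minus Euclidean distance scaled by the largest distance."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    max_dist = dist.max()
    if max_dist == 0:
        # All points coincide, so every pair is maximally similar.
        return np.ones_like(dist)
    return 1.0 - dist / max_dist

def reorder_by_cluster(sim, labels):
    """Permute rows and columns so that items with the same label are adjacent."""
    order = np.argsort(labels, kind="stable")
    return sim[np.ix_(order, order)], order

def shade(sim, ramp=" .:*#"):
    """Map each similarity to a character; later characters in the ramp mean higher similarity."""
    idx = np.minimum((sim * len(ramp)).astype(int), len(ramp) - 1)
    return ["".join(ramp[k] for k in row) for row in idx]

# Toy data (hypothetical): two well-separated groups, interleaved in the input order.
points = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.2], [5.1, 4.9]]
labels = [0, 1, 0, 1]

sim = similarity_matrix(points)
ordered, order = reorder_by_cluster(sim, labels)
rows = shade(ordered)
for row in rows:
    print(row)
```

In this sketch the cluster labels are supplied by hand; in practice the ordering would come from whatever clustering or concept tree is being inspected.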
Similar resources
Classification Visualization with Shaded Similarity Matrix
The shaded similarity matrix has long been used in visual cluster analysis. This paper investigates how it can be used in classification visualization. We focus on two popular classification methods: nearest neighbor and decision tree. Ensemble classifier visualization is also presented for handling large data sets.
Clustering of Time-Series Data Streams
This paper presents a time-series whole clustering system that incrementally constructs a tree-like hierarchy of clusters. The Online Divisive-Agglomerative Clustering (ODAC) system uses a correlation-based similarity measure between time-series over a data stream. When turning a leaf into a node, the cluster is divided in two and new leaves start new computations. An agglomerative phase is used...
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Visualization of Small World Networks Using Similarity Matrices
Visualization of small world networks is challenging owing to the large size of the data and its property of being “locally dense but globally sparse.” Generally, networks are represented using graph layouts and images of adjacency matrices, which in their direct form have the shortcomings of occlusion and spatial complexity. These shortcomings are usually alleviated using pixel displays, hierarchical...
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...