Mining Clustering Dimensions

Authors

Sajib Dasgupta

Vincent Ng

Abstract

Conclusion and Future Work
• Presented an algorithm that learns, and helps users visualize, the important clustering dimensions of a dataset.
• Future work involves quantifying the multi-clusterability and ambiguity of a dataset.

Producing the optimal clustering
• Spectral clustering; objective function: normalized cut
• Optimal partitioning function f: f = e2, the second eigenvector of the Laplacian
• Apply k-means to cluster the data points represented by e2
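
The spectral clustering steps listed above (normalized cut, second eigenvector e2, then k-means on e2) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian affinity, the `sigma` parameter, the choice of the symmetric normalized Laplacian, the power-iteration solver used to approximate e2, and all function names are hypothetical choices made here for the example.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Dense affinity matrix with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).

    The Gaussian kernel and sigma are assumptions, not taken from the source.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def second_eigenvector(W, iters=500):
    """Approximate e2 of the normalized Laplacian L = I - D^-1/2 W D^-1/2.

    e2 of L is the eigenvector of M = D^-1/2 W D^-1/2 with the second-largest
    eigenvalue, so power iteration is run on M + I while projecting out the
    known top eigenvector, which is proportional to D^1/2 * 1.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    M = [[inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    total = math.sqrt(sum(deg))
    e1 = [math.sqrt(d) / total for d in deg]  # unit-length top eigenvector
    v = [float(i + 1) for i in range(n)]      # deterministic starting vector
    for _ in range(iters):
        # Multiply by (M + I), remove the e1 component, then renormalize.
        v = [v[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        dot = sum(a * b for a, b in zip(v, e1))
        v = [a - dot * b for a, b in zip(v, e1)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    return v


def kmeans_1d(values, iters=100):
    """Lloyd's algorithm with k = 2 on scalars, seeded at the min and max."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in values]
        for k in (0, 1):
            members = [x for x, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


def spectral_bipartition(points, sigma=1.0):
    """Two-way split: affinity -> e2 -> k-means on the entries of e2."""
    return kmeans_1d(second_eigenvector(gaussian_affinity(points, sigma)))


# Two well-separated toy groups (made-up data for illustration only).
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
              (5.0, 5.0), (5.1, 5.2), (5.2, 5.0)]
toy_labels = spectral_bipartition(toy_points)
```

On the toy data the first three points should share one label and the last three the other; which group receives label 0 depends on the sign of e2, which is arbitrary. For real datasets a library eigen-solver would normally replace the power iteration used here.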

Related resources

Density-Based Clustering of Streaming Data Using Weighting Scheme

Clustering of data streams is an important issue in data mining. A large number of algorithms exist for clustering data streams, but most of them give equal weight to all dimensions of the data stream. Some dimensions may play an important role in clustering while others may be useless. In this paper, we introduce a density-based algorithm in which the ...

High Dimensional Data Clustering through Efficient Evolutionary Algorithm

Dimensionality reduction is essential in multidimensional data mining, since the dimensionality of real-time data can easily grow large. Most recent efforts on dimensionality reduction, however, are not adequate for multidimensional data because they lack scalability. In this paper, we use an evolutionary algorithm for the dimension reduction process. Initially, our proposed evo...

Prediction mining Generalization-Based clustering method

Mining user patterns from log files can provide significant and useful knowledge. This paper presents an approach for mining similarity of interest among web users from their past access behaviors. Unlike traditional clustering methods that focus on grouping objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent patter...

Soft Subspace Clustering for High-Dimensional Data

High-dimensional data is a phenomenon in real-world data mining applications. Text data is a typical example. In text mining, a text document is viewed as a vector of terms whose dimension is equal to the total number of unique terms in a data set, which is usually in the thousands. High-dimensional data occurs in business as well. In retail, for example, to effectively manage supplier relationshi...

Mafia: Efficient and Scalable Subspace Clustering for Very Large Data Sets

Clustering techniques are used in database mining for finding interesting patterns in high dimensional data. These are useful in various applications of knowledge discovery in databases. Some challenges in clustering for large data sets in terms of scalability, data distribution, understanding end-results, and sensitivity to input order, have received attention in the recent past. Recent approach...

Simultaneous Clustering: A Survey

Although most of the clustering literature focuses on one-sided clustering algorithms, simultaneous clustering has recently gained attention as a powerful tool that makes it possible to circumvent some limitations of the classical clustering approach. Simultaneous clustering methods perform clustering in the two dimensions simultaneously. In this paper, we introduce a large number of existing simultaneous clus...

Publication date: 2010
