DBSC: A Dependency-Based Subspace Clustering Algorithm for High Dimensional Numerical Datasets

Authors

  • Xufei Wang
  • Chunping Li
Abstract

We present a novel algorithm called DBSC, which finds subspace clusters in numerical datasets based on the concept of “dependency”. The algorithm uses a depth-first search strategy to find the maximal subspaces: a new dimension is added to the current k-subspace, and its validity as a (k+1)-subspace is evaluated. The clusters within those maximal subspaces are then mined in a fashion similar to the maximal subspace mining itself. Experiments on synthetic and real datasets show that the algorithm is both effective and efficient for high dimensional datasets.
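
The depth-first extension step described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not define the dependency test, so `is_valid` is a hypothetical caller-supplied predicate, and the sketch assumes validity is downward-closed (every subset of a valid subspace is also valid). The names `maximal_subspaces`, `all_pairs_dependent`, and `dependent_pairs` are likewise invented for this example.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List


def maximal_subspaces(
    n_dims: int, is_valid: Callable[[FrozenSet[int]], bool]
) -> List[FrozenSet[int]]:
    """Enumerate maximal valid subspaces by depth-first extension.

    `is_valid` is a hypothetical stand-in for the paper's dependency test.
    Assumes validity is downward-closed, so every valid subspace is reached
    by adding dimensions in increasing index order.
    """
    visited: List[FrozenSet[int]] = []

    def extend(current: FrozenSet[int], start: int) -> None:
        # Try to grow the current k-subspace into a (k+1)-subspace.
        for d in range(start, n_dims):
            candidate = current | {d}
            if is_valid(candidate):
                visited.append(candidate)
                extend(candidate, d + 1)

    extend(frozenset(), 0)
    # Keep only subspaces that are not strictly contained in another valid one.
    return [s for s in visited if not any(s < t for t in visited)]


# Toy validity test (an assumption for illustration only): a subspace is
# valid when every pair of its dimensions is listed as dependent.
dependent_pairs = {(0, 1), (0, 2), (1, 2), (3, 4)}


def all_pairs_dependent(subspace: FrozenSet[int]) -> bool:
    return all(
        pair in dependent_pairs for pair in combinations(sorted(subspace), 2)
    )


result = maximal_subspaces(5, all_pairs_dependent)
print([sorted(s) for s in result])
```

With the toy pairs above, dimensions 0, 1, 2 form one maximal subspace and dimensions 3, 4 form another. A real dependency measure would replace `all_pairs_dependent`; if that measure is not downward-closed, this enumeration could miss valid subspaces.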

Similar resources

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Holo-Entropy Based Categorical Data Hierarchical Clustering

Clustering high-dimensional data is a challenging task in data mining, and clustering high-dimensional categorical data is even more challenging because it is more difficult to measure the similarity between categorical objects. Most algorithms assume feature independence when computing similarity between data objects, or make use of computationally demanding techniques such as PCA for numerica...

A Robust k-Means Type Algorithm for Soft Subspace Clustering and Its Application to Text Clustering

Soft subspace clustering is an effective clustering technique for high dimensional datasets. Although several soft subspace clustering algorithms have been developed in recent years, their robustness should be further improved. In this work, a novel soft subspace clustering algorithm, RSSKM, is proposed. It is based on the incorporation of the alternative distance metric into the framework of kme...

Exploring Constraints Inconsistence for Value Decomposition and Dimension Selection Using Subspace Clustering

Datasets in the form of object-attribute-time are referred to as three-dimensional (3D) datasets. Because 3D datasets contain many timestamps, they are very difficult to cluster, so a subspace clustering method is applied to cluster them. Existing algorithms are inadequate to solve this clustering problem. Most of them are not actionable (ability to suggest profitable or benef...

Temporal Subspace Clustering for Unsupervised Action Segmentation

Action segmentation (segmenting a continuous sequence of motion data into a set of actions) has a wide range of applications and plays a role in many problems in computer vision. We look at subspace clustering as an unsupervised approach for this task. Classical subspace clustering methods uncover relationships within the data by learning codes for the samples (i.e. frames), but in this process...

Publication date: 2007