Robust Clustering in Arbitrarily Oriented Subspaces

نویسندگان

  • Elke Achtert
  • Christian Böhm
  • Jörn David
  • Peer Kröger
  • Arthur Zimek
چکیده

In this paper, we propose an efficient and effective method to find arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possible arbitrarily oriented subspaces. The objective of a clustering algorithm based on this principle is to find those among all the possible subspaces, that accommodate many database objects. In contrast to existing approaches, our method can find subspace clusters of different dimensionality even if they are sparse or are intersected by other clusters within a noisy environment. A broad experimental evaluation demonstrates the robustness, efficiency and effectivity of our method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hierarchical Subspace Clustering

It is well-known that traditional clustering methods considering all dimensions of the feature space usually fail in terms of efficiency and effectivity when applied to high-dimensional data. This poor behavior is based on the fact that clusters may not be found in the high-dimensional feature space, although clusters exist in subspaces of the feature space. To overcome these limitations of tra...

متن کامل

Global Correlation Clustering Based on the Hough Transform

In this article, we propose an efficient and effective method for finding arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possible arbitrarily oriented subspaces. The objective of a clustering algorithm based on this principle is to find those among all the possible subspaces that accommodate many database objects. In contrast to existin...

متن کامل

Hard-LOST: Modified k-Means for Oriented Lines

Robust clustering of data into linear subspaces is a common problem. Here we treat clustering into one-dimensional subspaces that cross the origin. This problem arises in blind source separation, where the subspaces correspond directly to columns of a mixing matrix. We present an algorithm that identifies these subspaces using a modified k-means procedure, where line orientations and distances ...

متن کامل

SONAR: Signal De-mixing for Robust Correlation Clustering

Clustering is one of the most fundamental challenges in data mining. We identified three core problems which turn finding a natural grouping of a data set into a difficult task: First, clusters may exist in arbitrarily oriented subspaces of various dimensionality (also known as correlation clusters). Secondly, the cluster structure may be hidden by noise and outliers. Finally, the number, size ...

متن کامل

Learning Transformations for Clustering and Classification Learning Transformations for Clustering and Classification

A low-rank transformation learning framework for subspace clustering and classification is here proposed. Many high-dimensional data, such as face images and motion sequences, approximately lie in a union of low-dimensional subspaces. The corresponding subspace clustering problem has been extensively studied in the literature to partition such highdimensional data into clusters corresponding to...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008