Spectral clustering with eigenvector selection

Authors

  • Tao Xiang
  • Shaogang Gong
Abstract

The task of discovering natural groupings of input patterns, or clustering, is an important aspect of machine learning and pattern analysis. In this paper, we study the widely used spectral clustering algorithm, which clusters data using eigenvectors of a similarity/affinity matrix derived from a data set. In particular, we aim to solve two critical issues in spectral clustering: (1) how to automatically determine the number of clusters, and (2) how to perform effective clustering given noisy and sparse data. An analysis of the characteristics of the eigenspace is carried out, which shows that (a) not every eigenvector of a data affinity matrix is informative and relevant for clustering; (b) eigenvector selection is critical because using uninformative/irrelevant eigenvectors could lead to poor clustering results; and (c) the corresponding eigenvalues cannot be used for relevant eigenvector selection given a realistic data set. Motivated by this analysis, a novel spectral clustering algorithm is proposed which differs from previous approaches in that only informative/relevant eigenvectors are employed for determining the number of clusters and performing clustering. The key element of the proposed algorithm is a simple but effective relevance learning method which measures the relevance of an eigenvector according to how well it can separate the data set into different clusters. Our algorithm was evaluated using synthetic data sets as well as real-world data sets generated from two challenging visual learning problems. The results demonstrated that our algorithm is able to estimate the cluster number correctly and reveal the natural grouping of the input data/patterns even given sparse and noisy data.
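
As a rough illustration of scoring an eigenvector by how well it separates the data, the sketch below ranks eigenvectors with a simple one-dimensional separation score. It is not the relevance learning method of the paper, whose details are not given in this abstract. The score (best two-way split of the sorted entries, measured as between-group variance over total variance), the function names separation_score and select_eigenvectors, and the 0.9 threshold are all assumptions made for illustration.

```python
import numpy as np

def separation_score(v):
    """Hypothetical proxy for relevance: the fraction of variance explained
    by the best two-way split of the entries of eigenvector v (0 to 1)."""
    v = np.sort(np.asarray(v, dtype=float))
    n = v.size
    mean = v.mean()
    total = np.sum((v - mean) ** 2)
    if n < 2 or total == 0.0:
        return 0.0
    # Try every split point of the sorted entries: the first i values
    # form the left group, the remaining n - i values the right group.
    sizes = np.arange(1, n)
    left_sum = np.cumsum(v)[:-1]
    left_mean = left_sum / sizes
    right_mean = (v.sum() - left_sum) / (n - sizes)
    between = sizes * (left_mean - mean) ** 2 + (n - sizes) * (right_mean - mean) ** 2
    return float(between.max() / total)

def select_eigenvectors(V, threshold=0.9):
    """Return the column indices of V whose score reaches the threshold.
    The threshold value is an illustrative assumption, not a tuned constant."""
    return [j for j in range(V.shape[1]) if separation_score(V[:, j]) >= threshold]
```

Under this proxy, an eigenvector whose entries fall into two tight groups scores close to 1, while one whose entries look like a single bell-shaped cloud scores noticeably lower, so only the former would be passed on to the clustering step.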

Related resources

Spectral clustering with eigenvector selection based on entropy ranking

The Ng–Jordan–Weiss (NJW) method is one of the most widely used spectral clustering algorithms. For a K-cluster problem, this method partitions data using the largest K eigenvectors of the normalized affinity matrix derived from the dataset. It has been demonstrated that the spectral relaxation solution of K-way grouping is located on the subspace of the largest K eigenvectors. However, we find ...
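
The NJW procedure mentioned above can be sketched as follows: build an affinity matrix, normalize it, keep the eigenvectors of the K largest eigenvalues, normalize each row to unit length, and run k-means on the rows. The Gaussian affinity with width sigma, the small deterministic k-means with farthest-point initialization, and the function names are assumptions chosen to keep the example self-contained; they are not taken from the listed paper.

```python
import numpy as np

def njw_embedding(X, k, sigma=1.0):
    """Rows of the k leading eigenvectors of D^{-1/2} A D^{-1/2}, unit-normalized."""
    # Gaussian affinity with a zero diagonal (sigma is an assumed scale).
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalized affinity matrix.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the eigenvectors of the k largest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, -k:]
    return Y / np.linalg.norm(Y, axis=1, keepdims=True)

def simple_kmeans(Y, k, n_iter=50):
    """Plain Lloyd iterations with farthest-point initialization (illustrative)."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def njw_cluster(X, k, sigma=1.0):
    return simple_kmeans(njw_embedding(X, k, sigma), k)
```

Note that this sketch always uses the K largest eigenvectors, which is exactly the choice the eigenvector selection work questions.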

Spectral 3D mesh segmentation with a novel single segmentation field

We present an automatic mesh segmentation framework, which achieves 3D segmentation in two stages, comprising hierarchical spectral analysis and isoline-based boundary detection. During hierarchical spectral analysis, a novel single segmentation field is defined to capture concavity-aware decompositions of eigenvectors from a concavity-aware Laplacian. Specifically, on the eigenvector hierarchy,...

Supervised and Unsupervised Clustering with Probabilistic Shift

We present a novel scale adaptive, nonparametric approach to clustering point patterns. Clusters are detected by moving all points to their cluster cores using shift vectors. First, we propose a novel scale selection criterion based on local density isotropy which determines the neighborhoods over which the shift vectors are computed. We then construct a directed graph induced by these shift ve...

Spectral Clustering by Recursive Partitioning

In this paper, we analyze the second eigenvector technique of spectral partitioning on the planted partition random graph model, by constructing a recursive algorithm using the second eigenvectors in order to learn the planted partitions. The correctness of our algorithm is not based on the ratio-cut interpretation of the second eigenvector, but exploits instead the stability of the eigenvector...
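
A minimal sketch of recursive partitioning with the second eigenvector is given below. It splits a node set by the sign of the eigenvector belonging to the second-largest eigenvalue of the adjacency submatrix, then recurses on each side. The sign-based split, the fixed recursion depth used as a stopping rule, and the function names are assumptions for illustration; the truncated abstract does not specify the paper's actual algorithm or its stopping criterion.

```python
import numpy as np

def second_eigenvector_split(A, nodes):
    """Split `nodes` by the sign of the second-largest eigenvector of A restricted to them."""
    sub = A[np.ix_(nodes, nodes)]
    # eigh sorts eigenvalues in ascending order; column -2 is the second largest.
    _, vecs = np.linalg.eigh(sub)
    v2 = vecs[:, -2]
    return nodes[v2 >= 0], nodes[v2 < 0]

def recursive_partition(A, nodes=None, depth=1):
    """Recursively bisect the graph; `depth` is an assumed stopping rule."""
    if nodes is None:
        nodes = np.arange(A.shape[0])
    if depth == 0 or nodes.size < 2:
        return [nodes]
    left, right = second_eigenvector_split(A, nodes)
    if left.size == 0 or right.size == 0:
        return [nodes]  # no usable split found
    return (recursive_partition(A, left, depth - 1)
            + recursive_partition(A, right, depth - 1))
```

With depth=1 the function returns a single bisection; larger values keep splitting each part, which only makes sense when the graph is expected to contain more than two planted groups.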

Journal:
  • Pattern Recognition

Volume 41

Publication date: 2008