Filtrated Algebraic Subspace Clustering

Authors

  • Manolis C. Tsakiris
  • René Vidal
Abstract

Subspace clustering is the problem of clustering data that lie close to a union of linear subspaces. Existing algebraic subspace clustering methods are based on fitting the data with an algebraic variety and decomposing this variety into its constituent subspaces. Such methods are well suited to the case of a known number of subspaces of known and equal dimensions, where a single polynomial vanishing in the variety is sufficient to identify the subspaces. While subspaces of unknown and arbitrary dimensions can be handled using multiple vanishing polynomials, current approaches are not robust to corrupted data due to the difficulty of estimating the number of polynomials. As a consequence, the current practice is to use a single polynomial to fit the data with a union of hyperplanes containing the union of subspaces, an approach that works well only when the dimensions of the subspaces are high enough. In this paper, we propose a new algebraic subspace clustering algorithm, which can identify the subspace S passing through a point x by constructing a descending filtration of subspaces containing S. First, a single polynomial vanishing in the variety is identified and used to find a hyperplane containing S. After intersecting this hyperplane with the variety to obtain a sub-variety, a new polynomial vanishing in the sub-variety is found, and so on until no non-trivial vanishing polynomial exists. At that point, our algorithm identifies S as the intersection of the hyperplanes identified thus far. By repeating this procedure for other points, our algorithm eventually identifies all the subspaces. Alternatively, by constructing a filtration at each data point and comparing any two filtrations using a suitable affinity, we propose a spectral version of our algebraic procedure based on spectral clustering, which is suitable for computations with noisy data.
We show by experiments on synthetic and real data that the proposed algorithm outperforms state-of-the-art methods on several occasions, thus demonstrating the merit of the idea of filtrations.
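
The filtration described in the abstract can be illustrated with a minimal Python sketch for noiseless data. This is not the authors' implementation: the function names (`veronese`, `vanishing_polys`, `filtration_subspace`) and the tolerance `tol` are hypothetical, the number of subspaces `n` is assumed known, and the data are assumed to contain enough points in general position on each subspace. The sketch also assumes that the hyperplane normal is taken as the gradient of a vanishing polynomial at x, a step the abstract does not spell out.

```python
import itertools
import numpy as np

def monomials(dim, degree):
    # Index tuples of all degree-`degree` monomials in `dim` variables.
    return list(itertools.combinations_with_replacement(range(dim), degree))

def veronese(Y, degree):
    # Evaluate every monomial at every row of Y; result has shape (N, M).
    cols = [np.prod(Y[:, list(m)], axis=1) for m in monomials(Y.shape[1], degree)]
    return np.stack(cols, axis=1)

def vanishing_polys(Y, degree, tol=1e-8):
    # Coefficient vectors of polynomials vanishing on all rows of Y
    # (right null space of the embedded data matrix).
    s, Vt = np.linalg.svd(veronese(Y, degree), full_matrices=True)[1:]
    rank = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    return Vt[rank:]

def gradient(coeffs, y, degree):
    # Gradient at y of the polynomial with the given monomial coefficients.
    g = np.zeros(len(y))
    for c, m in zip(coeffs, monomials(len(y), degree)):
        for pos, k in enumerate(m):
            rest = m[:pos] + m[pos + 1:]
            g[k] += c * np.prod(y[list(rest)])
    return g

def filtration_subspace(X, x, n, tol=1e-8):
    """Orthonormal basis (D, d) of the subspace through x (noiseless sketch)."""
    B = np.eye(X.shape[1])          # basis of the current ambient subspace
    Y, y = X.copy(), x.copy()       # data and x in coordinates of B
    while B.shape[1] > 0:
        polys = vanishing_polys(Y, n, tol)
        if len(polys) == 0:         # no non-trivial vanishing polynomial: stop
            break
        b = max((gradient(c, y, n) for c in polys), key=np.linalg.norm)
        if np.linalg.norm(b) < tol:
            raise ValueError("gradient vanishes at x; x may lie on several subspaces")
        b = b / np.linalg.norm(b)   # normal of a hyperplane containing S
        Q = np.linalg.svd(b[None, :])[2][1:].T   # orthonormal basis of b-perp
        on_plane = np.abs(Y @ b) <= tol * np.maximum(np.linalg.norm(Y, axis=1), 1.0)
        # Intersect the data with the hyperplane and drop one dimension.
        Y, y, B = Y[on_plane] @ Q, y @ Q, B @ Q
    return B

# Toy data: a plane (z = 0) and a line (the z-axis) in R^3.
rng = np.random.default_rng(0)
plane = np.c_[rng.standard_normal((20, 2)), np.zeros(20)]
line = np.c_[np.zeros((10, 2)), rng.standard_normal(10)]
X = np.vstack([plane, line])
print(filtration_subspace(X, plane[0], n=2).shape)
print(filtration_subspace(X, line[0], n=2).shape)
```

With exact data the loop stops as soon as the reduced points admit no vanishing polynomial of degree `n`, and the returned basis spans the intersection of the hyperplanes found so far. For noisy data the hard thresholds used here would not be appropriate; the abstract points to the spectral variant for that setting.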


Similar resources

Generalized Principal Component Analysis (GPCA): an Algebraic Geometric Approach to Subspace Clustering and Motion Segmentation


Subspace Clustering with Applications to Dynamical Vision (CS 229 Final Project)

Data that arises from engineering applications often contains some type of low dimensional structure that enables intelligent representation and processing. This leads to a very challenging problem: discovering compact representations of high-dimensional data. A very common approach to address this problem is modeling data as a mixture of multiple linear (or affine) subspaces. Given a set of da...


Nonlinearly Structured Low-Rank Approximation

Polynomially structured low-rank approximation problems occur in algebraic curve fitting (e.g., conic section fitting), subspace clustering (generalized principal component analysis), and nonlinear and parameter-varying system identification. The maximum likelihood estimation principle applied to these nonlinear models leads to nonconvex optimization problems and yields inconsistent estima...


Hierarchical Subspace Clustering

It is well known that traditional clustering methods that consider all dimensions of the feature space usually fail in terms of efficiency and effectiveness when applied to high-dimensional data. This poor behavior stems from the fact that clusters may not be found in the high-dimensional feature space, although clusters exist in subspaces of the feature space. To overcome these limitations of tra...


Effective Evaluation Measures for Subspace Clustering of Data Streams

Nowadays, most streaming data sources are becoming high-dimensional. Accordingly, subspace stream clustering, which aims at finding evolving clusters within subgroups of dimensions, has gained significant importance. However, existing subspace clustering evaluation measures are mainly designed for static data, and cannot reflect the quality of the evolving nature of data streams. On the other ...



Journal:
  • SIAM J. Imaging Sciences

Volume 10  Issue 

Pages  -

Publication date: 2017