Maximin Separation Probability Clustering

Authors

  • Gao Huang
  • Jianwen Zhang
  • Shiji Song
  • Zheng Chen
Abstract

This paper proposes a new approach to discriminative clustering. The intuition is that, given a good clustering, one should be able to learn a classifier from the cluster labels that achieves high generalization accuracy. We therefore define a novel metric for evaluating the quality of a clustering labeling, named Minimum Separation Probability (MSP), which is a lower bound on the generalization accuracy of a classifier learnt from that labeling. Taking MSP as the objective to maximize, we propose Maximin Separation Probability Clustering (MSPC), which has several attractive properties, such as invariance to anisotropic feature scaling and an intuitive probabilistic explanation of clustering quality. We present three efficient optimization strategies for MSPC and analyze their connections to existing clustering approaches, such as maximum margin clustering (MMC) and discriminative k-means. Empirical results on real-world data sets verify that MSP is a robust and effective clustering quality measure, and show that the proposed algorithms compare favorably to state-of-the-art clustering algorithms in both accuracy and efficiency.
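
The intuition stated in the abstract can be illustrated with a small sketch. It does not compute MSP and is not the authors' algorithm. As a stand-in, it scores a labeling by the held-out accuracy of a simple nearest-centroid classifier trained on the cluster labels. The function names (`holdout_accuracy`, `make_demo_data`), the choice of classifier, and the synthetic two-blob data are all assumptions made for illustration.

```python
import numpy as np


def nearest_centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    labels = np.unique(y)
    centroids = np.stack([X[y == k].mean(axis=0) for k in labels])
    return labels, centroids


def nearest_centroid_predict(model, X):
    """Assign each row of X the label of its closest centroid."""
    labels, centroids = model
    # Squared Euclidean distance from every point to every centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[d.argmin(axis=1)]


def holdout_accuracy(X, cluster_labels, test_fraction=0.3, seed=0):
    """Proxy for clustering quality (an assumption, not MSP itself):
    train on part of the data using the cluster labels as targets,
    then measure how well those labels are predicted on the rest."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    model = nearest_centroid_fit(X[train], cluster_labels[train])
    pred = nearest_centroid_predict(model, X[test])
    return float((pred == cluster_labels[test]).mean())


def make_demo_data(seed=0):
    """Two well-separated Gaussian blobs, plus a 'good' labeling that
    follows the blobs and a 'bad' labeling drawn at random."""
    rng = np.random.default_rng(seed)
    a = rng.normal(loc=[-5.0, 0.0], scale=0.5, size=(100, 2))
    b = rng.normal(loc=[5.0, 0.0], scale=0.5, size=(100, 2))
    X = np.vstack([a, b])
    good = np.repeat([0, 1], 100)
    bad = rng.integers(0, 2, size=200)
    return X, good, bad


if __name__ == "__main__":
    X, good, bad = make_demo_data()
    print("good labeling:", holdout_accuracy(X, good))
    print("random labeling:", holdout_accuracy(X, bad))
```

Under these assumptions, the labeling that follows the blobs is predicted far more reliably on held-out points than the random labeling, which is the behaviour the abstract's intuition describes. MSP itself is described as a lower bound on generalization accuracy, not an empirical estimate like this one.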

Similar resources

Bounds for Maximin Latin Hypercube Designs

Latin hypercube designs (LHDs) play an important role when approximating computer simulation models. To obtain good space-filling properties, the maximin criterion is frequently used. Unfortunately, constructing maximin LHDs can be quite time-consuming when the numbers of dimensions and design points increase. In these cases, we can use approximate maximin LHDs. In this paper, we construct bound...

K-maximin clustering: a maximin correlation approach to partition-based clustering

We propose a new clustering algorithm based upon the maximin correlation analysis (MCA), a learning technique that can minimize the maximum misclassification risk. The proposed algorithm resembles conventional partition clustering algorithms such as k-means in that data objects are partitioned into k disjoint partitions. On the other hand, the proposed approach is unique in that an MCA-based ap...

A Probability-Based Combination Method for Unsupervised Clustering with Application to Blind Source Separation

Unsupervised clustering algorithms can be combined to improve the robustness and the quality of the results, e.g. in blind source separation. Before combining the results of these clustering methods the corresponding clusters have to be aligned, but usually it is not known which clusters of the employed methods correspond to each other. In this paper, we present a method to avoid this correspon...

Selection Operators Based on Maximin Fitness Function for Multi-Objective Evolutionary Algorithms

We propose three operators based on the maximin fitness function (MFF). The first uses MFF. The second uses MFF when applying Maximin-Constraint and modified MFF when applying Maximin-Clustering. The third uses modified MFF. According to the results, the three operators are competitive in solving multi-objective optimization problems of both low dimensionality (two or three) and high dimensionality (more than three) in ...

Recovery guarantees for exemplar-based clustering

For a certain class of distributions, we prove that the linear programming relaxation of k-medoids clustering (a variant of k-means clustering where means are replaced by exemplars from within the dataset) distinguishes points drawn from nonoverlapping balls with high probability once the number of points drawn and the separation distance between any two balls are sufficiently large. Our results h...

Publication date: 2015