Optimizing the Cauchy-Schwarz PDF Distance for Information Theoretic, Non-parametric Clustering

Authors

  • Robert Jenssen
  • Deniz Erdogmus
  • Kenneth E. Hild
  • José Carlos Príncipe
  • Torbjørn Eltoft
Abstract

This paper addresses the problem of efficient information theoretic, non-parametric data clustering. We develop a procedure for adapting the cluster memberships of the data patterns, in order to maximize the recent Cauchy-Schwarz (CS) probability density function (pdf) distance measure. Each pdf corresponds to a cluster. The CS distance is estimated analytically and non-parametrically by means of the Parzen window technique for density estimation. The resulting form of the cost function makes it possible to develop an efficient adaptation procedure based on constrained gradient descent, using stochastic approximation of the gradients. The computational complexity of the algorithm is O(MN), M ≪ N, where N is the total number of data patterns and M is the number of data patterns used in the stochastic approximation. We show that the new algorithm is capable of performing well on several odd-shaped and irregular data sets.
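
As a rough illustration of the Parzen-window estimate mentioned in the abstract, the sketch below computes a CS distance between two fixed clusters. It assumes the commonly stated form D_CS = -log( ∫pq / sqrt(∫p² ∫q²) ) and isotropic Gaussian kernels of a single width `sigma`; under those assumptions each integral reduces to an average of Gaussians of variance 2·sigma² over sample pairs. The function names and the `sigma` parameter are hypothetical, and this is the full pairwise computation for two given clusters, not the membership-adaptation procedure or the O(MN) stochastic approximation described above.

```python
import numpy as np


def gaussian_cross_potential(a, b, sigma):
    """Average Gaussian kernel value over all pairs (a_i, b_j).

    Assumes isotropic Gaussian Parzen windows of width sigma. The integral
    of the product of two such windows is a Gaussian with variance
    2 * sigma**2 evaluated at the difference of their centers.
    """
    dim = a.shape[1]
    var = 2.0 * sigma ** 2
    # Pairwise squared Euclidean distances, shape (len(a), len(b)).
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    norm = (2.0 * np.pi * var) ** (-dim / 2.0)
    return norm * np.mean(np.exp(-sq_dists / (2.0 * var)))


def cs_distance(a, b, sigma):
    """Parzen-window estimate of the CS distance between clusters a and b."""
    cross = gaussian_cross_potential(a, b, sigma)
    within_a = gaussian_cross_potential(a, a, sigma)
    within_b = gaussian_cross_potential(b, b, sigma)
    return -np.log(cross / np.sqrt(within_a * within_b))
```

With this estimator, `cs_distance(a, a, sigma)` is zero up to rounding, and the value grows as the two sample sets move apart, which is the quantity a clustering procedure of this kind would try to maximize.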

Similar resources

Schwarz boundary problem on a triangle

In this paper, the Schwarz boundary value problem (BVP) for the inhomogeneous Cauchy-Riemann equation in a triangle is investigated explicitly. Firstly, by the technique of parqueting-reflection and the Cauchy-Pompeiu representation formula, a modified Cauchy-Schwarz representation formula is obtained. Then, the Schwarz BVP is solved explicitly. In particular, the boundary behavio...

A Non-parametric Maximum Entropy Clustering

Clustering is a fundamental tool for exploratory data analysis. Information theoretic clustering is based on the optimization of information theoretic quantities such as entropy and mutual information. Recently, since these quantities can be estimated in a non-parametric manner, non-parametric information theoretic clustering has gained much attention. Assuming the dataset is sampled from a certain cl...

A Convex Cauchy-Schwarz Divergence Measure for Blind Source Separation

Independent Component Analysis (ICA) for the demixing of multiple source mixtures. We call it the Convex Cauchy-Schwarz Divergence (CCS-DIV), and it is formed by integrating convex functions into the Cauchy-Schwarz inequality. The new measure is symmetric and the degree of its curvature with respect to the joint-distribution can be tuned by a (convexity) parameter. The CCS-DIV is able to speed-...

Clustering with Bregman Divergences

A wide variety of distortion functions, such as squared Euclidean distance, Mahalanobis distance, Itakura-Saito distance and relative entropy, have been used for clustering. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clust...

Convex Cauchy Schwarz Independent Component Analysis for Blind Source Separation

We present a new high-performance Convex Cauchy–Schwarz Divergence (CCS-DIV) measure for Independent Component Analysis (ICA) and Blind Source Separation (BSS). The CCS-DIV measure is developed by integrating convex functions into the Cauchy–Schwarz inequality. By including a convexity quality parameter, the measure has a broad control range of its convexity curvature. With this measure, a ne...

Publication date: 2005