Hybrid Minimal Spanning Tree and Mixture of Gaussians Based Clustering Algorithm

Authors

  • Ágnes Vathy-Fogarassy
  • Attila Kiss
  • János Abonyi
Abstract

Clustering is an important tool for exploring the hidden structure of large databases. There are several algorithms based on different approaches (hierarchical, partitional, density-based, model-based, etc.). Most of these algorithms have shortcomings: for example, they cannot detect clusters with non-convex shapes, the number of clusters must be known a priori, and they suffer from numerical problems such as sensitivity to initialization. In this paper we introduce a new clustering algorithm based on the synergistic combination of hierarchical, graph-theoretic minimal spanning tree based clustering and partitional Gaussian mixture model-based clustering. The aim of this hybridization is to increase the robustness and consistency of the clustering results and to reduce the number of heuristically defined parameters, and thereby the influence of the user on the clustering results. As the illustrative examples show, the proposed algorithm can detect clusters of arbitrary shape and does not suffer from the numerical problems of Gaussian mixture based clustering algorithms.
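The abstract does not spell out how the two methods are combined, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm: build a Euclidean minimal spanning tree, cut its longest edges to obtain a coarse partition, and summarize each part as a Gaussian that could serve as a starting point for EM. The function names, the "cut the k-1 longest edges" rule, and the diagonal-variance assumption are all hypothetical choices made for illustration.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_partition(points, k):
    """Assumed cutting rule: drop the k-1 longest MST edges, label the components."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    root = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]

def initial_gaussians(points, labels):
    """Per-cluster weight, mean, and diagonal variance, usable as EM starting values."""
    dim = len(points[0])
    params = []
    for c in sorted(set(labels)):
        members = [p for p, l in zip(points, labels) if l == c]
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # A small constant keeps variances strictly positive (illustrative choice).
        var = [sum((p[d] - mean[d]) ** 2 for p in members) / len(members) + 1e-6
               for d in range(dim)]
        params.append({"weight": len(members) / len(points), "mean": mean, "var": var})
    return params

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels = mst_partition(points, 2)
params = initial_gaussians(points, labels)
```

Starting the mixture from an MST-derived partition rather than from random values is one way to sidestep the initialization sensitivity the abstract mentions; the number of cuts `k` is still user-supplied in this sketch.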

Similar resources

SOLVING A STEP FIXED CHARGE TRANSPORTATION PROBLEM BY A SPANNING TREE-BASED MEMETIC ALGORITHM

In this paper, we consider the step fixed-charge transportation problem (FCTP), in which a step fixed cost, sometimes called a setup cost, is incurred if another related variable assumes a nonzero value. To solve this NP-hard problem, two metaheuristics are developed: a spanning tree-based genetic algorithm (GA) and a spanning tree-based memetic algorithm (MA). For compa...

LC Note: LC-TOOL-2004-020 arXiv:physics/0409039 CALORIMETER CLUSTERING WITH MINIMAL SPANNING TREES

We present a top-down approach to calorimeter clustering. An algorithm based on minimal spanning tree theory is described briefly.

Estimation of Rényi Information Divergence via Pruned Minimal Spanning Trees

In this paper we develop robust estimators of the Rényi information divergence (I-divergence) given a reference distribution and a random sample from an unknown distribution. Estimation is performed by constructing a minimal spanning tree (MST) passing through the random sample points and applying a change of measure which flattens the reference distribution. In a mixture model where the refere...

Efficient EM Training of Gaussian Mixtures with Missing Data

In data-mining applications, we are frequently faced with a large fraction of missing entries in the data matrix, which is problematic for most discriminant machine learning algorithms. A solution that we explore in this paper is the use of a generative model (a mixture of Gaussians) to compute the conditional expectation of the missing variables given the observed variables. Since training a G...

Non-convex clustering using expectation maximization algorithm with rough set initialization

An integration of a minimal spanning tree (MST) based graph-theoretic technique and expectation maximization (EM) algorithm with rough set initialization is described for non-convex clustering. EM provides the statistical model of the data and handles the associated uncertainties. Rough set theory helps in faster convergence and avoidance of the local minima problem, thereby enhancing the perfo...

Publication date: 2006