k-means clustering

k*-Means: A new generalized k-means clustering algorithm

Journal: Pattern Recognition Letters, 2003

Yiu-ming Cheung

This paper presents a generalized version of the conventional k-means clustering algorithm [Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1, University of California Press, Berkeley, 1967, p. 281]. The new algorithm is not only applicable to ellipse-shaped data clusters without the dead-unit problem, but also performs correct clustering without pre-assigning the exact...


Solving clustering problems using simulated annealing optimization

Thesis: Ministry of Science, Research and Technology - Shiraz University - Faculty of Science, 1388 SH (2009/10)

زهره السادات ناظمی, کورش زیارتی, محمدباقر احمدی, عبدالعزیز عبدالهی

Clustering is a process in which a set of samples is divided into clusters such that the members of each cluster are as similar to one another as possible and different clusters are as different from one another as possible. Clustering is a common data mining and data analysis technique. In data clustering, reaching the optimal solution becomes harder as the data size grows, and consequently the time required to reach solutions that are...


Analysis of k-Means++ for Separable Data

2012

Ragesh Jaiswal, Nitin Garg

The k-means++ [5] seeding procedure is a simple sampling-based algorithm that is used to quickly find k centers, which may then be used to start Lloyd’s method. There has been some progress recently on understanding this sampling algorithm. Ostrovsky et al. [10] showed that if the data satisfies the separation condition $\Delta_{k-1}(P)/\Delta_k(P) \ge c$ ($\Delta_i(P)$ is the optimal cost w.r.t. i centers, c > 1...
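As a rough illustration of the seeding procedure named in this abstract, the sketch below follows the commonly described k-means++ rule: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The function names `sq_dist` and `kmeanspp_seed` are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (illustrative sketch)."""
    rng = rng or random.Random()
    # First center: uniform at random.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform sampling.
            centers.append(rng.choice(points))
            continue
        # Later centers: probability proportional to squared distance.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
    print(kmeanspp_seed(data, 3, random.Random(0)))
```

The returned centers would then typically be passed to Lloyd's method as its starting point. Points that are already centers have weight zero, so with distinct input points the same point is not chosen twice.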


Nyström Method with Kernel K-means++ Samples as Landmarks

2017

Dino Oglic, Thomas Gärtner

We investigate, theoretically and empirically, the effectiveness of kernel K-means++ samples as landmarks in the Nyström method for low-rank approximation of kernel matrices. Previous empirical studies (Zhang et al., 2008; Kumar et al., 2012) observe that the landmarks obtained using (kernel) K-means clustering define a good low-rank approximation of kernel matrices. However, the existing work d...


k-means++ under Approximation Stability

Journal: Theor. Comput. Sci., 2013

Manu Agarwal, Ragesh Jaiswal, Arindam Pal

Lloyd’s algorithm, also known as the k-means algorithm, is one of the most popular algorithms for solving the k-means clustering problem in practice. However, it does not give any performance guarantees, so there are datasets on which it can behave very badly. One reason for poor performance on certain datasets is bad initialization. The following simple sampling ba...
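The sensitivity to initialization described in this abstract can be seen on a toy example. The sketch below is a minimal version of Lloyd's iteration (assign each point to its nearest center, then move each center to the mean of its points), run from two hand-picked starting configurations. The dataset and the names `lloyd` and `cost` are assumptions made for illustration only and do not come from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Run Lloyd's iterations from the given initial centers (illustrative sketch)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                dim = len(c)
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centers.append(c)  # an empty cluster keeps its old center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    # Four corners of a wide rectangle, k = 2.
    data = [(0, 0), (0, 1), (4, 0), (4, 1)]
    good = lloyd(data, [(0, 0), (4, 0)])  # one seed on each short side
    bad = lloyd(data, [(2, 0), (2, 1)])   # both seeds on the vertical midline
    print(cost(data, good), cost(data, bad))  # prints 1.0 16.0
```

From the second starting configuration the iteration stops immediately at a local optimum whose cost is 16 times that of the first, which is the kind of behavior that motivates careful seeding.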


A Fast Approximation Scheme for Low-Dimensional k-Means

2018

Vincent Cohen-Addad

We consider the popular k-means problem in d-dimensional Euclidean space. Recently Friggstad, Rezapour, Salavatipour [FOCS’16] and Cohen-Addad, Klein, Mathieu [FOCS’16] showed that the standard local search algorithm yields a $(1+\varepsilon)$-approximation in time $(n \cdot k)^{1/\varepsilon^{O(d)}}$, giving the first polynomial-time approximation scheme for the problem in low-dimensional Euclidean space. While local search achiev...


Unsupervised Learning of Acoustic Events Using Dynamic Time Warping and Hierarchical K-Means++ Clustering

2011

Joerg Schmalenstroeer, Markus Bartek, Reinhold Häb-Umbach

In this paper we propose to jointly consider Segmental Dynamic Time Warping and distance clustering for the unsupervised learning of acoustic events. As a result, the computational complexity increases only linearly with the database size compared to a quadratic increase in a sequential setup, where all pairwise SDTW distances between segments are computed prior to clustering. Further, we discu...


Comparative Study of k-means and k-Means++ Clustering Algorithms on Crime Domain

Journal: JCS, 2014

Bashar Aubaidan, Masnizah Mohd, Mohammed Albared

This study presents the results of an experimental comparison of two document clustering techniques, k-means and k-means++. In particular, we compare the two main approaches in crime document clustering. The drawback of k-means is that the user needs to define the centroid point. This becomes more critical when dealing with document clustering because each center point represented by a word ...


Unsupervised learning approach to automation of hammering test using topological information

2017

Jun Younes Louhi Kasahara, Hiromitsu Fujii, Atsushi Yamashita, Hajime Asama

In this paper we present an online unsupervised method based on clustering to find defects in concrete structures using hammering. First, the initial dataset of sound samples is roughly clustered using the k-means algorithm with the k-means++ seeding procedure in order to find the cluster best representative of the structure. Then the regular model for the hammering sound, the centroid of this ...


Automatic classification of stellar spectra from SDSS-DR9 data using artificial neural networks

Thesis: Ministry of Science, Research and Technology - University of Zanjan - Faculty of Science, 1393 SH (2014/15)

شکوفه خیردستان, مهدی بازرگان

In this project, a probabilistic neural network, the k-means algorithm, and principal component analysis are applied to the automatic classification of stellar spectra. To this end, the set of stellar spectra collected by the Sloan Digital Sky Survey SEGUE-DR9 and DR10 is used, which comprises 400013 spectra with a common wavelength range of 3850 to 8900 angstroms. Stellar spectra often contain a large amount of redundant information or noise...