PCA-guided search for K-means
Abstract
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to the nonconvexity of the model formulations, expectation-maximization (EM) type algorithms converge to different local optima with different initializations. Recent discoveries have identified that the global solution of K-means cluster centroids lies in the principal component analysis (PCA) subspace. Based on this insight, we propose PCA-guided effective search for K-means. Because the PCA subspace is much smaller than the original space, searching in the PCA subspace is both more effective and efficient. Extensive experiments on four real world data sets and systematic comparison with previous algorithms demonstrate that our proposed method outperforms the rest as it makes the K-means more effective. © 2015 Elsevier B.V. All rights reserved.
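The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose search procedure is not given here. It only shows the general pattern under stated assumptions: project the data onto the leading principal components, run several random restarts of Lloyd's iterations in that small subspace, then map the best centroids back and refine them in the original space. The function names `lloyd` and `pca_guided_kmeans`, the choice of K-1 components, and the restart count are all illustrative assumptions.

```python
import numpy as np


def lloyd(X, centers, n_iter=100):
    """Plain Lloyd iterations from given initial centers.

    Returns (centers, labels, sse). Helper name is hypothetical.
    """
    centers = centers.copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and sum of squared errors for the returned centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    sse = d2[np.arange(len(X)), labels].sum()
    return centers, labels, sse


def pca_guided_kmeans(X, n_clusters, n_restarts=20, seed=0):
    """Search for centroids in a PCA subspace, then refine in full space.

    Assumption: the subspace is spanned by the top (n_clusters - 1)
    principal directions; the source only says "the PCA subspace".
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    n_comp = max(1, min(n_clusters - 1, Vt.shape[0]))
    V = Vt[:n_comp].T          # shape (d, n_comp)
    Z = Xc @ V                 # data expressed in the PCA subspace

    best_centers, best_sse = None, np.inf
    for _ in range(n_restarts):
        # Random restart: pick distinct data points as initial centers.
        idx = rng.choice(len(Z), size=n_clusters, replace=False)
        centers_z, _, sse_z = lloyd(Z, Z[idx])
        if sse_z < best_sse:
            best_centers, best_sse = centers_z, sse_z

    # Map the best subspace centroids back and refine in the original space.
    init_full = best_centers @ V.T + mean
    return lloyd(X, init_full)


# Small synthetic demo: three Gaussian blobs in 10 dimensions.
demo_rng = np.random.default_rng(1)
demo_means = np.zeros((3, 10))
demo_means[1, 0] = 10.0
demo_means[2, 1] = 10.0
demo_X = np.vstack([m + 0.5 * demo_rng.standard_normal((50, 10)) for m in demo_means])
demo_centers, demo_labels, demo_sse = pca_guided_kmeans(demo_X, n_clusters=3)
```

Each restart in the subspace costs far less than a restart in the original space when the dimension is large, which is the efficiency argument the abstract makes; the final full-space refinement is an added design choice so that the returned centroids are not restricted to the subspace.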
Similar resources
Comparing k-means clusters on parallel Persian-English corpus
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, a great deal of research on clustering English documents has been carried out. This raises the question: do those results extend to other languages? Since the goal of document clustering is grouping of docum...
Fault detection, identification, and isolation for the gas turbine of the South Pars second refinery using hybrid data mining methods: k-means, principal component analysis (PCA), and support vector machine (SVM)
This paper addresses fault detection, identification, and isolation for a gas turbine. First, the k-means algorithm is used to reduce the dimensionality of the raw data. Then, by applying principal component analysis (PCA), the knowledge hidden in the data from the turbine's normal operating conditions is extracted and used to detect and identify gas turbine faults. In the next stage, using the support vector machine (...
Improved Cluster Partition in Principal Component Analysis Guided Clustering
Principal component analysis (PCA) guided clustering approach is widely used in high dimensional data to improve the efficiency of K-means cluster solutions. Typically, Pearson correlation is used in PCA to provide an eigenanalysis to obtain the associated components that account for most of the variations in the data. However, PCA based Pearson correlation can be sensitive on non-Gaussian distr...
A GUIDED TABU SEARCH FOR PROFILE OPTIMIZATION OF FINITE ELEMENT MODELS
In this paper a Guided Tabu Search (GTS) is utilized for optimal nodal ordering of finite element models (FEMs) leading to small profile for the stiffness matrices of the models. The search strategy is accelerated and a graph-theoretical approach is used as guidance. The method is evaluated by minimization of graph matrices pattern equivalent to stiffness matrices of finite element models. Comp...
Special Issue on Recommendation and Search in Social Systems
The open nature of collaborative recommender systems allows attackers who inject biased profile data to have a significant impact on the recommendations produced. Standard memory-based collaborative filtering algorithms, such as k-nearest neighbor, are quite vulnerable to profile injection attacks. Previous work has shown that some model-based techniques are more robust than standard k-nn. Mode...
Journal title:
- Pattern Recognition Letters
Volume 54
Publication date 2015