Improved Constrained k-Means Algorithm for Clustering with Domain Knowledge
Abstract
With the rapid development of machine learning technology, emerging applications raise the challenge of using domain knowledge to improve clustering accuracy, since clustering suffers from a limited accuracy rate despite its advantage of fast processing. In this paper, we model domain knowledge (i.e., background knowledge or side information) as must-link and cannot-link sets so that it can be combined with k-means for better accuracy. We first propose an algorithm for constrained k-means that considers only must-links. The key idea is to treat a set of data points constrained by must-links as a single point whose weight equals the sum of the weights of the constrained points. Then, for cannot-links, we employ minimum-weight matching to assign the data points to the existing clusters. Finally, we carry out numerical simulations to evaluate the proposed algorithms on UCI datasets, demonstrating that our method outperforms previous constrained k-means algorithms as well as traditional k-means in clustering accuracy, although with a slightly compromised practical runtime.
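The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`contract_must_links`, `assign_cannot_link_set`) are hypothetical, every original point is assumed to have weight 1, the merged point is assumed to sit at the centroid of its group, and a brute-force search over permutations stands in for a proper minimum-weight bipartite matching, so it is only suitable for very small cannot-link sets.

```python
from itertools import permutations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def contract_must_links(points, must_links):
    """Merge each must-link component into one weighted point.

    Returns a list of (centroid, weight, member_indices). The weight is the
    number of merged points, assuming every original point has weight 1.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_links:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)

    dim = len(points[0])
    merged = []
    for members in groups.values():
        centroid = tuple(
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        )
        merged.append((centroid, len(members), members))
    return merged


def assign_cannot_link_set(group, centers):
    """Assign mutually cannot-linked points to distinct centers.

    Picks the assignment with minimum total squared distance by brute force
    (an assumption made for brevity). Requires len(group) <= len(centers).
    """
    best = min(
        permutations(range(len(centers)), len(group)),
        key=lambda perm: sum(sq_dist(p, centers[c]) for p, c in zip(group, perm)),
    )
    return list(best)


points = [(0, 0), (2, 0), (10, 0), (10, 2), (5, 5)]
merged = contract_must_links(points, must_links=[(0, 1), (2, 3)])
labels = assign_cannot_link_set([(1, 0), (2, 0)], centers=[(0, 0), (10, 0)])
```

In the usage lines, points 0 and 1 collapse into one point of weight 2, as do points 2 and 3. The two cannot-linked points both lie closest to the first center, but the matching forces them into different clusters at the lowest total cost. For larger sets, a polynomial-time assignment solver would replace the permutation search.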
Similar resources
An Improved K-Means with Artificial Bee Colony Algorithm for Clustering Crimes
Crime detection is one of the major issues in the field of criminology. In fact, criminology includes knowing the details of a crime and its intangible relations with the offender. In spite of the enormous amount of data on offenses and offenders, and the complex and intangible semantic relationships between this information, criminology has become one of the most important areas in the field o...
Constrained K-means Clustering with Background Knowledge
Clustering is traditionally viewed as an unsupervised method for data analysis. However, in some cases information about the problem domain is available in addition to the data instances themselves. In this paper, we demonstrate how the popular k-means clustering algorithm can be profitably modified to make use of this information. In experiments with artificial constraints on six data sets, we...
Clustering Using Boosted Constrained k-Means Algorithm
This article proposes a constrained clustering algorithm with competitive performance and less computation time than the state-of-the-art methods, which consists of a constrained k-means algorithm enhanced by the boosting principle. Constrained k-means clustering using constraints as background knowledge, although easy to implement and quick, has insufficient performance compared with metric learn...
Constrained clustering with k-means
We introduce a k−means type clustering in the presence of cannot–link and must–link constraints. First we apply a BIRCH type methodology to eliminate must–link constraints. Next we introduce a penalty function to substitute cannot–link constraints. When penalty values increase to +∞ the original cannot–link constraints are recovered. The preliminary numerical experiments show that constraints h...
Constrained K-Means Clustering
We consider practical methods for adding constraints to the K-Means clustering algorithm in order to avoid local solutions with empty clusters or clusters having very few points. We often observe this phenomenon when applying K-Means to datasets where the number of dimensions is n ≥ 10 and the number of desired clusters is k ≥ 20. We propose explicitly adding k constraints to the underlying clusteri...
Journal
Journal title: Mathematics
Year: 2021
ISSN: 2227-7390
DOI: https://doi.org/10.3390/math9192390