Map/Reduce Affinity Propagation Clustering Algorithm
Authors
Abstract
Affinity Propagation (AP) is a clustering algorithm that does not require the number of clusters K to be set in advance. We extend the original AP to Map/Reduce Affinity Propagation (MRAP), implemented in Hadoop, a distributed cloud environment. The MRAP architecture is divided into multiple mappers and one reducer in Hadoop. In the experiments, we compare the clustering results of the proposed MRAP with those of the K-means method. The experimental results show that the proposed MRAP method performs well in terms of accuracy and Davies–Bouldin index value. In addition, applying the proposed MRAP method can reduce the number of iterations the K-means method needs before convergence, irrespective of the data dimensions.
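As a point of reference for the claim that AP needs no preset K, here is a minimal single-machine sketch of the standard AP message-passing updates (responsibilities and availabilities). It is not the MRAP method: the abstract does not say how work is split between the mappers and the reducer, so nothing here models that. The function name `affinity_propagation`, the negative squared Euclidean similarity, the median preference, the damping factor, and the fixed iteration count are all assumptions chosen for illustration.

```python
from statistics import median


def affinity_propagation(points, damping=0.5, iterations=200):
    """Plain single-machine AP; returns (exemplar indices, label per point)."""
    n = len(points)
    # Similarity s(i, k): negative squared Euclidean distance (assumed choice).
    s = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
         for p in points]
    # Preference s(k, k): median of the off-diagonal similarities
    # (a common default; an assumption, not taken from the abstract).
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = pref

    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of [a(i, k') + s(i, k')]
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                rival = max(v for j, v in enumerate(scores) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]
            for i in range(n):
                new = support if i == k else min(0.0, r[k][k] + support - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive; K is never supplied by the caller.
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:  # messages did not settle on any exemplar
        return [], [None] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
              for i in range(n)]
    return exemplars, labels


# Hypothetical toy data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
exemplars, labels = affinity_propagation(points)
print(len(exemplars), labels)
```

The number of clusters is simply the number of exemplars that emerge from the messages; the preference value is the knob that influences it, which is why it is exposed here only through a default.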
Similar resources
A New Knowledge-Based System for Diagnosis of Breast Cancer by a combination of the Affinity Propagation and Firefly Algorithms
Breast cancer has become a widespread disease among young women around the world. Expert systems developed with data mining techniques are valuable tools in the diagnosis of breast cancer and can help physicians in the decision-making process. This paper presents a new hybrid data mining approach to classify two groups of breast cancer patients (malignant and benign). The proposed approach, AP-AMBFA, con...
A parallel attribute reduction algorithm based on Affinity Propagation clustering
As information technology develops rapidly, massive and high-dimensional data sets have appeared in abundance. Existing attribute reduction methods are running into bottlenecks in time and space efficiency. AP (Affinity Propagation) is an efficient and fast clustering algorithm for large datasets compared with existing clustering algorithms. This paper discusses attribute clus...
Partition Affinity Propagation for Clustering Large Scale of Data in Digital Library
Data clustering is very useful in helping users navigate large-scale data in a digital library. In this paper, we present an improved algorithm, based on Affinity Propagation, for clustering large-scale data sets with dense relationships. First, the input data are divided into several groups and Affinity Propagation is applied to each group separately. The results from the first step are grouped together in...
A Graph Clustering Algorithm Providing Scalability
Based on current studies of the affinity propagation and normalized cut algorithms, this paper proposes a new scalable graph clustering method called APANC (Affinity Propagation And Normalized Cut). In the APANC process, we first use Affinity Propagation (AP) to preliminarily group the original data in order to reduce the data scale, and then we further group the re...
A Survey On Seeds Affinity Propagation
Affinity propagation (AP) is a clustering method that can find data centers or clusters by sending messages between pairs of data points. Seed Affinity Propagation is a novel semi-supervised text clustering algorithm based on AP. The AP algorithm cannot directly handle partially known data. Therefore, to address this issue, a semi-supervised scheme called incremental affinity propagation c...