A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

Authors

Abstract:

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, appropriate methods are needed to analyze the large raw datasets that are produced. Extracting useful information from large datasets has always been one of the most important challenges in different sciences, and data mining techniques provide useful solutions to this problem. Clustering, the most widely used data mining function, has attracted the attention of many researchers in various sciences. Because of its many applications, time series clustering has become highly popular, and many approaches have been presented in this field. An efficient clustering method groups data so that objects in the same cluster are more similar to each other than to objects in different clusters. To compute the difference or similarity between time series during clustering, a similarity measure or distance function is used. Choosing an appropriate distance function is therefore one of the most important decisions to make before the clustering process starts. Various distance functions have been proposed to measure the difference or similarity between time series, and each has its own strengths and weaknesses. Since choosing a suitable distance function for a specific dataset is a complicated process, in this study we propose a clustering method that combines the well-known Fuzzy C-Means (FCM) method with Particle Swarm Optimization (PSO) and can use several distance functions in the time series clustering process. In this way, the step of choosing the best distance function before clustering is eliminated, and different similarity measures can participate in the clustering process with different impacts.
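As a rough illustration of how several similarity measures might take part "with different impacts", the sketch below computes the three distances named later in the abstract (Euclidean, dynamic time warping, and a Pearson-based distance) and blends them with a weight vector. The weighted-sum form, the use of 1 - r as the correlation distance, the absence of any rescaling between measures, and the names `combined_distance` and `weights` are all assumptions made for illustration; the abstract does not state how the paper actually combines the measures.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Classic dynamic time warping with absolute-difference step cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def pearson_distance(a, b):
    """1 - Pearson r: 0 for perfectly correlated series, 2 for anti-correlated.

    Assumes equal-length, non-constant series (a constant series has zero variance).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def combined_distance(a, b, weights):
    """Hypothetical weighted sum of the three distances (illustrative only)."""
    parts = (euclidean(a, b), dtw(a, b), pearson_distance(a, b))
    return sum(w * d for w, d in zip(weights, parts))
```

In practice the three measures live on different scales, so some normalization would probably be needed before weighting them; that step is left out here to keep the sketch short.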
The objective function in this study is defined based on the Fuzzy C-Means clustering objective function, and the Particle Swarm Optimization algorithm is used to find its optimal value. Using three distance functions (Euclidean distance, dynamic time warping, and the Pearson correlation coefficient), the proposed method was applied to seven well-known UCR time series datasets. With average normalized mutual information as the performance criterion, the proposed method was compared with five other methods. The comparison indicated that the proposed method outperformed the other methods in more than 85% of cases. For a more rigorous evaluation, Tukey's multiple comparison test with a threshold of p < 0.05 was used to compare the methods pairwise. The Tukey test showed that, in about 83% of cases, the differences between the results of the proposed method and those of the other five techniques are statistically significant. Overall, the results of this study showed the superiority of the proposed clustering method in producing high-quality clusters in comparison with the other methods considered.
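The two ingredients named above can be sketched separately. The first two functions use the standard FCM formulas: the membership update u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)) and the objective J_m = sum_i sum_k u_ik^m * d_ik^2, computed from a precomputed distance matrix. The third is a plain global-best PSO for minimizing an arbitrary function. The abstract does not say what the particles encode or how the objective is modified, so `pso_minimize`, its parameters, and its default values are illustrative assumptions rather than the paper's configuration.

```python
import random

def fcm_memberships(dist, m=2.0):
    """Standard FCM membership update from a distance matrix.

    dist[i][k] is the distance from series i to cluster prototype k.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for row in dist:
        if any(d == 0.0 for d in row):
            # A series lying exactly on a prototype belongs to it fully
            # (split evenly if several prototypes coincide).
            zeros = [1.0 if d == 0.0 else 0.0 for d in row]
            total = sum(zeros)
            memberships.append([z / total for z in zeros])
        else:
            memberships.append(
                [1.0 / sum((d_k / d_j) ** power for d_j in row) for d_k in row]
            )
    return memberships

def fcm_objective(dist, memberships, m=2.0):
    """J_m = sum_i sum_k u_ik^m * d_ik^2."""
    return sum(
        (u ** m) * (d ** 2)
        for d_row, u_row in zip(dist, memberships)
        for d, u in zip(d_row, u_row)
    )

def pso_minimize(f, dim, bounds=(0.0, 1.0), particles=20, iters=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO; returns (best_position, best_value, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus pulls toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history
```

One plausible way to connect the pieces, offered only as an assumption, is to let each particle encode candidate parameters (for example, distance weights or cluster prototypes), rebuild `dist` from them, and pass the resulting `fcm_objective` value to `pso_minimize` as `f`.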

similar resources

OPTIMIZATION OF FUZZY CLUSTERING CRITERIA BY A HYBRID PSO AND FUZZY C-MEANS CLUSTERING ALGORITHM

This paper presents an efficient hybrid method, namely fuzzy particle swarm optimization (FPSO) and fuzzy c-means (FCM) algorithms, to solve the fuzzy clustering problem, especially for large sizes. When the problem becomes large, the FCM algorithm may result in uneven distribution of data, making it difficult to find an optimal solution in reasonable amount of time. The PSO algorithm does find ago...

An improved fuzzy C-means clustering algorithm based on PSO

To deal with the problem of premature convergence of the fuzzy c-means clustering algorithm based on particle swarm optimization, which is sensitive to noise and less effective when handling data sets whose dimensionality exceeds the number of samples, a novel fuzzy c-means clustering method based on the enhanced Particle Swarm Optimization algorithm is presented. Firstly, this approach dist...

A Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data

The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...

ADAPTIVE NEURO FUZZY INFERENCE SYSTEM BASED ON FUZZY C–MEANS CLUSTERING ALGORITHM, A TECHNIQUE FOR ESTIMATION OF TBM PENETRATION RATE

The tunnel boring machine (TBM) penetration rate estimation is one of the crucial and complex tasks encountered frequently to excavate the mechanical tunnels. Estimating the machine penetration rate may reduce the risks related to high capital costs typical for excavation operation. Thus establishing a relationship between rock properties and TBM pe...

Fuzzy Time Series Forecasting Based On K-Means Clustering

Many forecasting models based on the concepts of Fuzzy time series have been proposed in the past decades. These models have been widely applied to various problem domains, especially in dealing with forecasting problems in which historical data are linguistic values. In this paper, we present a new fuzzy time series forecasting model, which uses the historical data as the universe of discourse...

Journal title

volume 10 issue 2

pages 23-37

publication date 2020-12
