Mining Uncertain Sequential Patterns in Iterative MapReduce
Abstract
This paper proposes a sequential pattern mining (SPM) algorithm for large-scale uncertain databases. Uncertain sequence databases are widely used to model inaccurate or imprecise timestamped data in many real applications, where traditional SPM algorithms are inapplicable because of data uncertainty and scalability issues. We develop an efficient approach to managing data uncertainty in SPM and design an iterative MapReduce framework to execute the uncertain SPM algorithm in parallel. We conduct extensive experiments on both synthetic and real uncertain datasets, and the experimental results show that our algorithm is efficient and scalable.
Similar resources
An Efficient Implementation of Chronic Inflation based Power Iterative Clustering Algorithm
In recent days the software plays a major role in application writing areas such that big data, it is used to store the information which is efficiently processed. Managing large datasets is one of the challenging tasks and also time consumption is...
Large-Scale Image Classification using High Performance Clustering
Computer vision is being revolutionized by the incredible volume of visual data available on the Internet. A key part of computer vision, data mining, or any Big Data problem is the analytics that transforms this raw data into understanding, and many of these data mining approaches require iterative computations at an unprecedented scale. Often an individual iteration can be specified as a MapR...
Parallel Sequential Pattern Mining of Massive Trajectory Data
The trajectory pattern mining problem has recently attracted much attention due to the rapid development of location-acquisition technologies, and parallel computing essentially provides an alternative method for handling this problem. This study precisely addresses the problem of parallel mining of trajectory sequential patterns based on the newly proposed concepts with regard to trajectory pa...
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Sequential Pattern Mining for Uncertain Data Streams using Sequential Sketch
Uncertainty is inherent in data streams and presents new challenges to data stream mining. Because data streams arrive continuously and are large, modeling sequences of uncertain time series data streams requires significantly more space. Therefore, it is important to construct a compressed representation for storing uncertain time series data. Based on granules, sequential sketches are created t...