Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm
Abstract
Discovery of frequent itemsets is a very important data mining problem with numerous applications. Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on frequent itemset mining has been done so far, focusing mainly on developing faster complete mining algorithms, efficient constraint handling, and reusing results of previous queries. Recently, a new problem of optimizing the processing of batches of frequent itemset queries has been considered, and two multiple query optimization techniques for frequent itemset queries, Common Counting and Mine Merge, have been proposed. Mine Merge does not depend on a particular mining algorithm, while Common Counting has been specifically designed to work with Apriori. Nevertheless, in previous works the efficiency of Mine Merge was tested only on Apriori, and it is unclear how it would perform with newer pattern-growth algorithms like FP-growth. In this paper we adapt the Common Counting method to work with FP-growth and evaluate the efficiency of both methods when FP-growth is used as the basic mining algorithm.
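To make the notion of a batch of frequent itemset queries over overlapping source datasets more concrete, the following is a minimal Python sketch, not the method evaluated in the paper. It assumes a toy in-memory database, two hypothetical queries (`q1`, `q2`) whose source datasets are ranges of transaction ids, and a minimum support count as the only pattern constraint. It counts only 2-itemsets, and it illustrates just the shared-scan idea: each transaction is read once and contributes to every query whose source dataset contains it. It does not build FP-trees.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy database: (transaction id, set of items).
DATABASE = [
    (1, {"a", "b", "c"}),
    (2, {"a", "b"}),
    (3, {"b", "c"}),
    (4, {"a", "c"}),
    (5, {"a", "b", "c"}),
    (6, {"b", "c"}),
]

# Two hypothetical queries. Each has a source-dataset predicate on the
# transaction id and a minimum support count. Transactions 3 and 4 are
# shared by both queries.
QUERIES = {
    "q1": {"source": lambda tid: 1 <= tid <= 4, "min_count": 2},
    "q2": {"source": lambda tid: 3 <= tid <= 6, "min_count": 3},
}


def count_pairs_shared_scan(database, queries):
    """Count 2-itemsets for every query in one pass over the database."""
    counts = {name: Counter() for name in queries}
    for tid, items in database:  # each transaction is read exactly once
        pairs = list(combinations(sorted(items), 2))
        for name, query in queries.items():
            if query["source"](tid):  # transaction belongs to this query
                counts[name].update(pairs)
    return counts


def frequent_pairs(database, queries):
    """Return, per query, the 2-itemsets meeting that query's threshold."""
    counts = count_pairs_shared_scan(database, queries)
    return {
        name: {
            pair
            for pair, n in counts[name].items()
            if n >= queries[name]["min_count"]
        }
        for name in queries
    }


RESULT = frequent_pairs(DATABASE, QUERIES)
```

In this sketch the saving comes from reading the shared transactions (ids 3 and 4) once instead of once per query; running the two queries separately would scan them twice.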
Similar resources
Three Strategies for Concurrent Processing of Frequent Itemset Queries Using FP-Growth
Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. Recently, a new problem of optimizing processing of sets of frequent itemset queries has been considered and two multiple query optimization techniques for frequent itemset queries: Mine Merge and Common Counting have been proposed and ...
Integration of candidate hash trees in concurrent processing of frequent itemset queries using Apriori
In this paper we address the problem of processing of batches of frequent itemset queries using the Apriori algorithm. The best solution of this problem proposed so far is Common Counting, which consists in concurrent execution of the queries using Apriori with the integration of scans of the parts of the database shared among the queries. In this paper we propose a new method – Common Candidat...
Control and Cybernetics: Integration of Candidate Hash Trees in Concurrent Processing of Frequent Itemset Queries Using Apriori
Abstract: Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. In this paper we address the problem of processing batches of frequent itemset queries using the Apriori algorithm. The best solution of this problem proposed so far is Common Counting, which consists in concurrent execution o...
Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions
The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...
Comparative Study of Frequent Itemset Mining Algorithms Apriori and FP Growth
Frequent itemset mining leads to the discovery of associations among items in large transactional databases. In this paper, two algorithms [7] for generating frequent itemsets are discussed: Apriori and FP-growth. In the Apriori algorithm, candidates are generated and then tested, which is easy to implement, but candidate generation and support counting are very expensive because dat...