Parallel Mining of Frequent Maximal Itemsets Using Order Preserving Generators
Abstract
In this paper, we propose POP-MAX (Parallel Order Preserving MAXimal itemset algorithm), a fast and memory-efficient parallel algorithm for mining maximal itemsets that enumerates all the maximal patterns concurrently and independently across several nodes. POP-MAX also uses an efficient maximality checking technique that determines the maximality of an itemset using fewer items. To improve load sharing among nodes, we use a round-robin strategy that achieves load balancing as high as 90%. We also incorporate bit-vectors and numerous optimizations to reduce the memory consumption and overall running time of the algorithm. Our comprehensive experimental analyses on both real and synthetic datasets show that our algorithm uses less memory and less running time than other maximal itemset mining algorithms.
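The abstract names three building blocks: bit-vectors, round-robin work distribution, and a maximality check. The sketch below illustrates each idea in isolation. It is not the POP-MAX algorithm: the function names (`build_bitvectors`, `support`, `round_robin`, `maximal_frequent`) are hypothetical, the maximality check is a brute-force reference rather than the paper's order-preserving technique, and nothing here runs in parallel.

```python
from itertools import combinations

def build_bitvectors(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    vectors = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors

def support(itemset, vectors, n_transactions):
    """Support is the popcount of the AND of the items' bit-vectors."""
    mask = (1 << n_transactions) - 1
    for item in itemset:
        mask &= vectors.get(item, 0)
    return bin(mask).count("1")

def round_robin(items, n_nodes):
    """Assign the i-th item to node i mod n_nodes (assumed partitioning scheme)."""
    parts = [[] for _ in range(n_nodes)]
    for i, item in enumerate(items):
        parts[i % n_nodes].append(item)
    return parts

def maximal_frequent(transactions, min_support):
    """Brute-force reference: frequent itemsets with no frequent proper superset."""
    vectors = build_bitvectors(transactions)
    n = len(transactions)
    items = sorted(vectors)
    frequent = [
        frozenset(candidate)
        for k in range(1, len(items) + 1)
        for candidate in combinations(items, k)
        if support(candidate, vectors, n) >= min_support
    ]
    return [s for s in frequent if not any(s < t for t in frequent)]

# Toy data, invented for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
vectors = build_bitvectors(transactions)
maximal = maximal_frequent(transactions, min_support=2)
partitions = round_robin(["a", "b", "c", "d", "e"], 2)
```

On this toy data every pair of items is frequent at a minimum support of 2 while the full triple is not, so the maximal itemsets are the three pairs. The brute-force enumeration is exponential in the number of items and serves only as a correctness reference for small inputs.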
Similar resources
Memory Efficient Mining of Maximal Itemsets using Order Preserving Generators
In this paper, we propose a memory efficient algorithm for maximal frequent itemset mining from transactional datasets. We propose OP-MAX* (Order Preserving – MAXimal itemset mining) algorithm, which mines all the maximal itemsets from transactional datasets with less space and time. Our methodology uses a memory efficient maximality checking technique to generate frequent maximal itemsets. We ...
Simultaneous mining of frequent closed itemsets and their generators: Foundation and algorithm
Closed itemsets and their generators play an important role in frequent itemset and association rule mining. They allow a lossless representation of all frequent itemsets and association rules and facilitate mining. Some recent approaches discover frequent closed itemsets and generators separately. The Close algorithm mines them simultaneously but it needs to scan the database many times. Based...
Maximal frequent itemset generation using segmentation approach
Finding frequent itemsets in a data source is a fundamental operation behind Association Rule Mining. Generally, many algorithms use either the bottom-up or top-down approaches for finding these frequent itemsets. When the length of frequent itemsets to be found is large, the traditional algorithms find all the frequent itemsets from 1-length to n-length, which is a difficult process. This prob...
An Algorithm for Mining High Utility Closed Itemsets and Generators
Traditional association rule mining based on the support-confidence framework provides the objective measure of the rules that are of interest to users. However, it does not reflect the utility of the rules. To extract non-redundant association rules in support-confidence framework frequent closed itemsets and their generators play an important role. To extract non-redundant association rules a...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...