Generating (Fuzzy) Frequent Itemsets by a Bitmap-based Algorithm – the World’s Most Compact Frequent Itemset Miner
Abstract
Mining frequent itemsets in databases is an important and widely studied problem in data mining research. The problem is usually solved by constructing candidate itemsets and keeping those that meet the frequency (minimum support) requirement. This paper proposes a novel algorithm based on a BitTable (or bitmap) representation of the data. Data related to frequent itemsets are stored in sparse matrices. Simple matrix and vector multiplications are used to calculate the support of the potential (n+1)-itemsets. The main benefit of this approach is that only bitmaps of the frequent itemsets are generated. The concept is simple and easily interpretable, and it supports a compact and effective implementation (in MATLAB). An application example based on the BMS-WebView-1 benchmark data is presented to illustrate the applicability of the proposed algorithm.
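The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: the function names `column_dot` and `mine_frequent_itemsets`, the prefix-based join of itemsets, and the use of plain Python lists in place of sparse matrices are all assumptions made for illustration. Each itemset is represented by a 0/1 column over the transactions, the support of a candidate (n+1)-itemset is the dot product of two n-itemset columns, and a new bitmap column is built only when the candidate turns out to be frequent.

```python
from itertools import combinations


def column_dot(a, b):
    """Dot product of two 0/1 columns: the number of transactions containing both itemsets."""
    return sum(x * y for x, y in zip(a, b))


def mine_frequent_itemsets(transactions, n_items, min_support):
    """Level-wise frequent itemset mining on bitmap columns (illustrative sketch)."""
    # One bitmap column per item: entry t is 1 if transaction t contains the item.
    columns = {
        (i,): [1 if i in t else 0 for t in transactions]
        for i in range(n_items)
    }
    # Frequent 1-itemsets: the support is simply the column sum.
    level = {s: c for s, c in columns.items() if sum(c) >= min_support}
    result = {}
    while level:
        result.update({s: sum(c) for s, c in level.items()})
        next_level = {}
        for (s1, c1), (s2, c2) in combinations(sorted(level.items()), 2):
            # Assumed join rule: combine two n-itemsets sharing their first n-1 items.
            if s1[:-1] != s2[:-1]:
                continue
            candidate = s1 + (s2[-1],)
            # Support of the union is the dot product of the two bitmap columns.
            if column_dot(c1, c2) >= min_support:
                # A bitmap is generated only for frequent candidates.
                next_level[candidate] = [x * y for x, y in zip(c1, c2)]
        level = next_level
    return result


# Hypothetical toy data: five transactions over items 0, 1 and 2.
transactions = [{0, 1, 2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]
supports = mine_frequent_itemsets(transactions, n_items=3, min_support=3)
print(supports)
```

With a minimum support of 3, every single item and every pair in the toy data is frequent, while the triple (0, 1, 2) occurs in only two transactions and is dropped without a bitmap ever being stored for it.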
Similar resources
Mining Frequent Sequences Using Itemset-Based Extension
In this paper, we systematically explore an itemset-based extension approach for generating candidate sequences, which yields better and more straightforward search-space traversal than the traditional item-based extension approach. Based on this candidate generation approach, we present FINDER, a novel algorithm for discovering the set of all frequent sequences. FINDER is compo...
MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS
This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...
A Fuzzy Algorithm for Mining High Utility Rare Itemsets – FHURI
Classical frequent itemset mining identifies frequent itemsets in transaction databases using only the frequency of item occurrences, without considering the utility of items. In many real-world situations, the utility of itemsets is based on the user’s perspective, such as cost, profit or revenue, and is of significant importance. Utility mining considers using utility factors in data mining tasks. Utility-...
A fast Algorithm for mining fuzzy frequent itemsets
In this paper, a fuzzy frequent itemset (FFI)-Miner algorithm is developed to mine the complete set of FFIs without candidate generation. It uses a novel fuzzy-list structure to keep the essential information for the later mining process. An efficient pruning strategy is also developed to reduce the search space, thus speeding up the mining process to directly discover the FFIs. Experiments are con...
LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation
Frequent itemset mining is one of the fundamental techniques for data mining and knowledge discovery. In the last decade, a number of efficient algorithms for frequent itemset mining have been presented, but most of them focused on just enumerating the itemsets which satisfy the given conditions, and it was a different matter how to store and index the mining result for efficient dat...