An Efficient Incremental Algorithm to Mine Closed Frequent Itemsets over Data Streams
Authors
Abstract
The purpose of this work is to mine closed frequent itemsets from transactional data streams using a sliding window model. An efficient algorithm, IMCFI, is proposed for Incremental Mining of Closed Frequent Itemsets from a transactional data stream. IMCFI uses a data structure called INdexed Tree (INT), similar to the NewCET structure used in NewMoment [5]. INT contains an index table, the ItemSet Reference List (ISRList), which records the locations of the closed frequent itemsets stored in the summary data structure. ISRList speeds up the search for closed frequent itemsets that have already been generated, which in turn reduces the overall time required to mine new closed frequent itemsets from a sliding window. Experiments show that IMCFI is more time- and space-efficient than NewMoment.
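To make the terms in the abstract concrete, the following is a minimal Python sketch of closed frequent itemsets over a sliding window. It is not IMCFI and not the INT structure: it recomputes the result by brute force for every window instead of updating it incrementally, and the item index is only a hypothetical illustration of the idea behind an index table such as ISRList. All function names, the window size, and the support threshold are assumptions chosen for the example.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Brute-force reference: map each closed frequent itemset to its support."""
    transactions = [frozenset(t) for t in window]
    items = sorted(set().union(*transactions))
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count
    # An itemset is closed if no proper superset has the same support.
    return {
        s: c
        for s, c in support.items()
        if not any(s < other and c == oc for other, oc in support.items())
    }


def build_item_index(closed):
    """Hypothetical index table: item -> stored closed itemsets containing it."""
    index = {}
    for itemset in closed:
        for item in itemset:
            index.setdefault(item, set()).add(itemset)
    return index


def closed_supersets(index, query):
    """Find stored closed itemsets that contain every item of `query`."""
    buckets = [index.get(item, set()) for item in query]
    if not buckets:
        return set()
    return set.intersection(*buckets)


def mine_stream(stream, window_size, min_support):
    """Yield the closed frequent itemsets after each full window (recomputed)."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            yield closed_frequent_itemsets(window, min_support)


if __name__ == "__main__":
    stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "b", "c"}, {"c"}]
    for result in mine_stream(stream, window_size=3, min_support=2):
        print({"".join(sorted(s)): c for s, c in result.items()})
```

The brute-force enumeration is exponential in the number of distinct items, so it is only useful as a reference for small inputs. An incremental algorithm avoids exactly this recomputation when the window slides.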
Similar resources
An Efficient Algorithm to Mine Online Data Streams
Mining frequent closed itemsets provides complete and condensed information for non-redundant association rules generation. Extensive studies have been done on mining frequent closed itemsets, but they are mainly intended for traditional transaction databases and thus do not take data stream characteristics into consideration. In this paper, we propose a novel approach for mining closed frequen...
Incremental updates of closed frequent itemsets over continuous data streams
Online mining of closed frequent itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose an efficient one-pass algorithm, NewMoment, to maintain the set of closed frequent itemsets in data streams with a transaction-sensitive sliding window. An effective bit-sequence representation of items is used in the proposed algorithm to reduce the...
Mining Frequent Itemsets with Normalized Weight in Continuous Data Streams
A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. The continuous characteristic of streaming data necessitates the use of algorithms that require only one scan over the stream for knowledge discovery. Data mining over data streams should support the flexible trade-off between processing time and mining accuracy. In many application areas, min...
SuffixMiner: Efficiently Mining Frequent Itemsets in Data Streams by Suffix-Forest
We proposed a new algorithm, SuffixMiner, which eliminates the requirement of multiple passes through the data when finding all frequent itemsets in data streams, takes full advantage of the special property of the suffix-tree to avoid generating candidate itemsets and traversing each suffix-tree during the itemset growth, and utilizes a new itemset growth method to mine all frequent itemsets in d...
Mining Top-k Frequent Closed Itemsets in Data Streams Using Sliding Window
Frequent itemset mining has become a popular research area in the data mining community over the last few years. There are two main technical hitches in finding frequent itemsets. First, the user must provide an appropriate minimum support value to start with and then needs to tune this value by running the algorithm again and again. Second, the generated frequent itemsets are mostly numerous and ...