SuffixMiner: Efficiently Mining Frequent Itemsets in Data Streams by Suffix-Forest
Authors
Abstract
We propose a new algorithm, SuffixMiner, which eliminates the need for multiple passes over the data when finding all frequent itemsets in data streams, takes full advantage of the special property of the suffix-tree to avoid generating candidate itemsets and traversing each suffix-tree during itemset growth, and uses a new itemset growth method to mine all frequent itemsets in data streams. Experimental results show that SuffixMiner not only scales well when mining frequent itemsets over data streams, but also outperforms the Apriori and FP-Growth algorithms.
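To make the single-pass idea concrete, the following is a minimal sketch of one way a forest of suffix tries could be built from a stream and queried for itemset support. It is not the SuffixMiner algorithm itself, whose details the abstract does not give. The names `Node`, `add_transaction`, and `support`, the sorted-item ordering, and the query strategy are all assumptions made for illustration.

```python
class Node:
    """One trie node: an item plus the number of suffixes passing through it."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def add_transaction(forest, transaction):
    """Insert every suffix of one transaction; each transaction is read once.

    Assumption: items are deduplicated and sorted, so each tree in the
    forest is rooted at the first item of the suffixes it stores.
    """
    items = sorted(set(transaction))
    for start in range(len(items)):
        node = forest.setdefault(items[start], Node(items[start]))
        node.count += 1
        for item in items[start + 1:]:
            node = node.children.setdefault(item, Node(item))
            node.count += 1


def _match(node, items, pos):
    """Count suffixes below `node` that contain items[pos:] in order."""
    if node.item == items[pos]:
        pos += 1
        if pos == len(items):
            return node.count  # every suffix through this node contains the itemset
    elif node.item > items[pos]:
        return 0  # paths are sorted, so the needed item cannot appear deeper
    return sum(_match(child, items, pos) for child in node.children.values())


def support(forest, itemset):
    """Number of transactions seen so far that contain every item of `itemset`."""
    items = sorted(set(itemset))
    if not items or items[0] not in forest:
        return 0
    # Only the tree rooted at the smallest item has to be visited.
    return _match(forest[items[0]], items, 0)


# Hypothetical toy stream, processed in a single pass.
forest = {}
for transaction in (["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]):
    add_transaction(forest, transaction)
```

In this sketch a query touches only the tree rooted at the itemset's smallest item, because every transaction containing that item contributes exactly one suffix starting there. Whether SuffixMiner orders items or grows itemsets this way is not stated in the abstract.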
Similar resources
CLAIM: An Efficient Method for Relaxed Frequent Closed Itemsets Mining over Stream Data
Recently, frequent itemset mining over data streams has attracted much attention. However, mining closed itemsets from data streams has not been well addressed. The main difficulty lies in the high maintenance complexity caused by the exact model definition of closed itemsets and the dynamic nature of data streams. In a data stream scenario, it is sufficient to mine only approximated frequent...
Efficient mining of temporal emerging itemsets from data streams
In this paper, we propose a new method, namely EFI-Mine, for mining temporal emerging frequent itemsets from data streams efficiently and effectively. The temporal emerging frequent itemsets are those that are infrequent in the current time window of data stream but have high potential to become frequent in the subsequent time windows. Discovery of emerging frequent itemsets is an important pro...
Top-k-FCI: Mining Top-K Frequent Closed Itemsets in Data Streams
With the generation and analysis of stream data, such as real-time network monitoring, log records, and click streams, a great deal of attention has been paid to data stream mining in the field of data mining. In the process of data stream mining, it is more reasonable to ask users to set a bound on the result size. Therefore, in this paper, a real-time single-pass algorithm, called ...
Concept Shift Detection for Frequent Itemsets from Sliding Windows over Data Streams
In a mobile business collaboration environment, frequent itemset analysis discovers notable associated events and data that provide important information about user behavior. Many algorithms have been proposed for mining frequent itemsets over data streams. However, in many practical situations where the data arrival rate is very high, continuously mining the data sets within a sliding wi...
A Simple but Effective Maximal Frequent Itemset Mining Algorithm over Streams
Maximal frequent itemsets are one of several condensed representations of frequent itemsets, which store most of the information contained in frequent itemsets using less space, thus being more suitable for stream mining. This paper considers a simple but effective algorithm for mining maximal frequent itemsets over a stream landmark. We design a compact data structure named FP-FOREST to improv...