Frequent Itemsets Mining with VIL - Tree Algorithm
Abstract
This paper develops a new algorithm, the vertical index list (VIL) tree algorithm, for mining all frequent itemsets from a transaction database. The main advantages of the previous algorithms, frequent pattern (FP) growth and inverted index structure (IIS) Mine, are retained in the new approach: the database is scanned only once, all frequent itemsets are mined without generating candidate itemsets, and changing the minimum support threshold does not affect the data structure. IIS Mine was proposed to reduce the recursive mining steps, node construction, and the size of the trees. However, IIS Mine has a drawback: when small transaction sets contribute to the early trees, sub-trees are produced as a result. To overcome this problem, VIL Tree mines large transaction sets first so that long frequent itemsets are obtained early. This is useful because many subsets of those frequent itemsets are then found directly, which reduces the recursive mining steps, nodes, and sub-trees. The performance of VIL Tree was tested against FP growth and IIS Mine. The experimental results show that VIL Tree outperforms both algorithms in run time and space consumption.
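The abstract does not describe how the VIL tree itself is built, so the following is only a minimal sketch of the general idea it relies on: a vertical index (item to transaction IDs) built in a single database scan and mined by intersecting ID sets, without generating candidate itemsets by joining. It is a generic Eclat-style illustration, not the VIL Tree algorithm, and the function names `build_vertical_index`, `support`, and `frequent_itemsets` are hypothetical.

```python
from collections import defaultdict


def build_vertical_index(transactions):
    """Scan the database once, mapping each item to the IDs of the transactions containing it."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return dict(index)


def support(index, itemset):
    """Support count of an itemset: size of the intersection of its items' ID sets."""
    tid_sets = [index.get(item, set()) for item in itemset]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


def frequent_itemsets(index, min_support):
    """Depth-first enumeration of frequent itemsets by ID-set intersection.

    Assumption: a plain Eclat-style search, used here only for illustration.
    The index is independent of min_support, so it can be reused when the
    threshold changes.
    """
    results = {}
    items = sorted(i for i, tids in index.items() if len(tids) >= min_support)

    def extend(prefix, prefix_tids, candidates):
        for pos, item in enumerate(candidates):
            tids = index[item] if prefix_tids is None else prefix_tids & index[item]
            if len(tids) >= min_support:
                itemset = prefix + (item,)
                results[itemset] = len(tids)
                # Only extend with later items so each itemset is visited once.
                extend(itemset, tids, candidates[pos + 1:])

    extend((), None, items)
    return results


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    idx = build_vertical_index(db)
    print(frequent_itemsets(idx, 3))
    print(frequent_itemsets(idx, 2))  # same index, lower threshold
```

Because the index stores only item-to-transaction mappings, lowering or raising the threshold needs no rebuild, which mirrors the property the abstract claims for the VIL data structure.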
Similar resources
A Novel Data Mining Method to Find the Frequent Patterns from Predefined Itemsets in Huge Dataset Using TMPIFPMM
Association rule mining is one of the important data mining techniques. It finds correlations among attributes in huge datasets. Those correlations are used to improve future business strategy. The core process of association rule mining is to find the frequent patterns (itemsets) in a huge dataset. Countless algorithms are available in the literature to find the frequent items...
Optimization Of Intersecting Algorithm For Transactions Of Closed Frequent Item Sets In Data Mining
Data mining is the computer-assisted process of information analysis. Mining frequent itemsets is a fundamental task in data mining. Unfortunately, the number of frequent itemsets describing the data is often too large to comprehend. This problem has been attacked by condensed representations of frequent itemsets that are subcollections of frequent itemsets containing only the frequent itemsets...
Mining Frequent Itemsets with Normalized Weight in Continuous Data Streams
A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. The continuous characteristic of streaming data necessitates the use of algorithms that require only one scan over the stream for knowledge discovery. Data mining over data streams should support the flexible trade-off between processing time and mining accuracy. In many application areas, min...
Discovery of Frequent Itemsets: Frequent Item Tree-Based Approach
Mining frequent patterns in large transactional databases is a highly researched area in the field of data mining. Existing frequent pattern discovery algorithms suffer from several problems: high memory dependency when mining large amounts of data, and high computational and I/O cost. Additionally, the recursive mining process to mine these structures is also too voracious in memory resource...
Maximal Frequent Itemsets Mining Using Database Encoding
Frequent itemset mining is a classic problem in data mining and has played an important role in data mining research for over a decade. However, mining all frequent itemsets leads to a massive number of itemsets. Fortunately, this problem can be reduced to the mining of maximal frequent itemsets. In this paper, we propose a new method for mining maximal frequent itemsets. Our method ...