Sequential Pattern Mining Algorithms: Trade-offs between Speed and Memory
Abstract
The increasing application of structured pattern mining requires a thorough understanding of the problem and a clear identification of the advantages and disadvantages of existing algorithms. Among those algorithms, pattern-growth methods have been shown to have the best performance when applied to sequential pattern mining. However, their advantages over apriori-based methods are not well explained or understood. Detailed analysis of the performance and memory requirements of these algorithms shows that counting the support for each potential pattern is the most computationally demanding step. Additionally, the analysis makes clear that the main advantage of pattern-growth over apriori-based methods resides in the restriction of the search space that is obtained from the creation of projected databases. In this paper, we present this analysis and describe how apriori-based algorithms can achieve the efficiency of pattern-growth methods.
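The two ideas named in the abstract, support counting and projected databases, can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes sequences of single items rather than itemsets, and the function names (`support`, `project`, `frequent_extensions`) and the toy database are hypothetical, chosen only for illustration.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Apriori-style support counting: scan every sequence for the candidate."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def project(database, item):
    """Pattern-growth style projection: keep only the suffix after the
    first occurrence of `item` in each sequence that contains it."""
    return [seq[seq.index(item) + 1:] for seq in database if item in seq]


def frequent_extensions(projected, min_support):
    """Count each item once per projected suffix; keep the frequent ones."""
    counts = Counter()
    for suffix in projected:
        counts.update(set(suffix))
    return {item: n for item, n in counts.items() if n >= min_support}


# Toy database (illustrative data, not from the paper).
db = [list("abcd"), list("acbd"), list("abd"), list("bcd")]

projected_a = project(db, "a")
extensions = frequent_extensions(projected_a, min_support=3)
```

In this sketch, extending the prefix `a` only requires scanning the three shorter suffixes in `projected_a`, whereas `support(["a", "d"], db)` rescans every full sequence for each candidate. Both routes give the same count for a given extension, which is the search-space restriction the abstract attributes to projected databases.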
Related resources
Stream ciphers and the eSTREAM project
Stream ciphers are an important class of symmetric cryptographic algorithms. The eSTREAM project contributed significantly to the recent increase of activity in this field. In this paper, we present a survey of the eSTREAM project. We also review recent time/memory/data and time/memory/key trade-offs relevant for the generic attacks on stream ciphers.
Mining Sequential Patterns with Regular Expression Constraints
Discovering sequential patterns is an important problem in data mining with a host of application domains including medicine, telecommunications, and the World Wide Web. Conventional sequential pattern mining systems provide users with only a very restricted mechanism (based on minimum support) for specifying patterns of interest. As a consequence, the pattern mining process is typically chara...
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is a new emerging field in data mining which has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds minimum threshold. The basic HUIM problem does not consider length of itemsets in its utility measurement and utility values tend to become higher for itemsets containing more items...
Mining Algorithms for Sequential Patterns in Parallel: Hash Based Approach
In this paper, we study the problem of mining sequential patterns in a large database of customer transactions. Since finding sequential patterns has to handle a large amount of customer transaction data and requires multiple passes over the database, it is expected that parallel algorithms help to improve the performance significantly. We consider the parallel algorithms for mining sequential pat...
Efficient Sequential Pattern Mining Algorithms
Sequential pattern mining is a heavily researched area in the field of data mining with a wide variety of applications. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Most of the methods dealing with the sequential pattern mining problem are based on the approach of the traditional task of...