Efficient Mining of High Utility Patterns over Data Streams with a Sliding Window Method
Authors
Abstract
High utility pattern (HUP) mining over data streams has become a challenging research issue in data mining. Existing sliding window-based HUP mining algorithms for stream data suffer from the level-wise candidate generation-and-test problem and therefore need a large amount of execution time and memory. Moreover, their data structures are not suitable for interactive mining. To solve these problems, in this paper we propose a new tree structure, called HUS-tree (High Utility Stream tree), and a novel algorithm, called HUPMS (HUP Mining over Stream data), for sliding window-based HUP mining over data streams. By capturing the important information of the stream data in an HUS-tree, our HUPMS algorithm can mine all the HUPs in the current window with a pattern growth approach. Moreover, the HUS-tree is very efficient for interactive mining. Extensive performance analyses show that our algorithm significantly outperforms the existing sliding window-based HUP mining algorithms.
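To make the problem setting concrete, the following is a minimal sketch of sliding window-based HUP mining by brute-force enumeration. It is not the HUS-tree or the HUPMS algorithm, whose details the abstract does not give. It assumes the usual definition of utility (item quantity in a transaction times the item's unit profit, summed over the items of a pattern and over the transactions that contain the whole pattern). All names (`SlidingWindow`, `utility`, `high_utility_patterns`) and the sample data are hypothetical.

```python
from collections import deque
from itertools import combinations


class SlidingWindow:
    """Holds the most recent `size` batches of transactions (hypothetical helper)."""

    def __init__(self, size):
        # A deque with maxlen drops the oldest batch when a new one arrives.
        self.batches = deque(maxlen=size)

    def add_batch(self, batch):
        self.batches.append(batch)

    def transactions(self):
        return [t for batch in self.batches for t in batch]


def utility(itemset, transaction, profit):
    """Utility of `itemset` in one transaction: sum of quantity * unit profit,
    or 0 if the transaction does not contain every item of the itemset."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def high_utility_patterns(window, profit, min_util):
    """Enumerate every itemset in the current window and keep those whose
    total utility reaches `min_util`. Exponential in the number of items;
    meant only to illustrate the definitions."""
    txs = window.transactions()
    items = sorted({item for t in txs for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            total = sum(utility(itemset, t, profit) for t in txs)
            if total >= min_util:
                result[itemset] = total
    return result


# Assumed sample data: unit profits and three batches of {item: quantity} transactions.
profit = {"a": 2, "b": 6, "c": 3}
batch1 = [{"a": 2, "b": 1}, {"b": 2, "c": 1}]
batch2 = [{"a": 1, "c": 4}]
batch3 = [{"a": 3, "b": 1, "c": 1}]

window = SlidingWindow(size=2)
window.add_batch(batch1)
window.add_batch(batch2)
print(high_utility_patterns(window, profit, min_util=14))

window.add_batch(batch3)  # batch1 slides out of the window
print(high_utility_patterns(window, profit, min_util=14))
```

In the first window, item `a` alone has utility 6 and falls below the threshold, while `{a, c}` reaches 14. Utility is therefore not anti-monotone, which is one reason HUP mining is harder than frequent pattern mining and why naive enumeration like this does not scale.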
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in the data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Frequent Patterns Mining over Data Stream Using an Efficient Tree Structure
Mining frequent patterns over data streams is an interesting problem due to its wide application area. In this study, a novel method for sliding window frequent patterns mining over data streams is proposed. This method utilizes a compressed and memory efficient tree data structure to store and to maintain sliding window transactions. The method dynamically reconstructs and compresses tree data...
Efficient Mining of High Utility Sequential Patterns Over Data Streams
High utility sequential pattern mining has emerged as an important topic in data mining. Although several preliminary works have been conducted on this topic, the existing studies mainly focus on mining high utility sequential patterns (HUSPs) in static databases and do not consider the streaming data. Mining HUSPs over data streams is very desirable for many applications. However, addressing t...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process that is used to promote the sharing of transactional databases among organizations and businesses; it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
A Single-scan Algorithm for Mining Sequential Patterns from Data Streams
Sequential pattern mining (SPAM) is one of the most interesting research issues of data mining. In this paper, a new research problem of mining data streams for sequential patterns is defined. A data stream is an unbounded sequence of data elements arriving at a rapid rate. Based on the characteristics of data streams, the problem complexity of mining data streams for sequential patterns is more ...