PTree: Mining Sequential Patterns Efficiently in Multiple Data Streams Environment

Authors

  • Guanling Lee
  • Yi-Chun Chen
  • Kuo-Che Hung
Abstract

Although data streams have been widely studied and applied, mining sequential patterns over them remains challenging. In this paper, we assume that a user's transactions arrive in parts and that no auxiliary mechanism is available for buffering and integrating them. We adopt the Path Tree to mine frequent sequential patterns over data streams and to integrate each user's sequences efficiently. We propose two algorithms, one oriented toward accuracy (PAlgorithm) and one toward space (PSAlgorithm), to meet different user requirements, and a third, GAlgorithm, for mining frequent sequential patterns under a gap limitation. Many pruning properties are used to further reduce space usage and improve the accuracy of our algorithms. We also prove that PAlgorithm mines frequent sequential patterns with an error guarantee on the approximate support. Experiments on synthetic and real datasets are used to verify the feasibility of our algorithms.
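To make the terms "support" and "gap limitation" concrete, here is a minimal Python sketch. It is not the paper's Path Tree, PAlgorithm, PSAlgorithm, or GAlgorithm; it is a brute-force, exact illustration over sequences held fully in memory. The function names and the definition of the gap (the number of items allowed between two consecutive matched items) are assumptions made for this example, since the abstract does not define them.

```python
from typing import Hashable, Optional, Sequence


def occurs_with_gap(
    sequence: Sequence[Hashable],
    pattern: Sequence[Hashable],
    max_gap: Optional[int] = None,
) -> bool:
    """Return True if `pattern` occurs in `sequence` as an ordered subsequence.

    Assumption for this sketch: `max_gap` is the largest number of items
    allowed between two consecutive matched items; None means no limit.
    """
    if not pattern:
        return True
    # Positions where a match of the first pattern item can end.
    ends = [i for i, item in enumerate(sequence) if item == pattern[0]]
    for target in pattern[1:]:
        next_ends = []
        for j, item in enumerate(sequence):
            if item != target:
                continue
            # Position j extends a partial match ending at i when i comes
            # before j and the items in between respect the gap limit.
            if any(
                i < j and (max_gap is None or j - i - 1 <= max_gap)
                for i in ends
            ):
                next_ends.append(j)
        ends = next_ends
        if not ends:
            return False
    return bool(ends)


def support(
    sequences: Sequence[Sequence[Hashable]],
    pattern: Sequence[Hashable],
    max_gap: Optional[int] = None,
) -> float:
    """Fraction of sequences that contain `pattern` under the gap limit."""
    if not sequences:
        return 0.0
    hits = sum(occurs_with_gap(s, pattern, max_gap) for s in sequences)
    return hits / len(sequences)


# Hypothetical user sequences, invented for illustration only.
user_sequences = [list("abcd"), list("axxb"), list("ba"), list("ab")]
unlimited = support(user_sequences, list("ab"))
adjacent_only = support(user_sequences, list("ab"), max_gap=0)
```

A pattern would be called frequent when its support reaches a user-chosen minimum threshold. A streaming method cannot keep every sequence as this sketch does, which is why the abstract speaks of approximate support with an error guarantee.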

Similar resources

Incremental Mining of Across-streams Sequential Patterns in Multiple Data Streams

Sequential pattern mining is the mining of data sequences for frequent sequential patterns with time sequence, which has a wide application. Data streams are streams of data that arrive at high speed. Due to the limitation of memory capacity and the need of real-time mining, the results of mining need to be updated in real time. Multiple data streams are the simultaneous arrival of a plurality ...

Incremental Mining of Closed Sequential Patterns in Multiple Data Streams

Sequential pattern mining searches for the relative sequence of events, allowing users to make predictions on discovered sequential patterns. Due to drastically advanced information technology over recent years, data have rapidly changed, growth in data amount has exploded and real-time demand is increasing, leading to the data stream environment. Data in this environment cannot be fully stored...

Efficiently Mining High Utility Sequential Patterns in Static and Streaming Data

High utility sequential pattern (HUSP) mining has emerged as a novel topic in data mining. Although some preliminary works have been conducted on this topic, they incur the problem of producing a large search space for high utility sequential patterns. In addition, they mainly focus on mining HUSPs in static databases and do not take streaming data into account, where unbounded data come contin...

Sequential Pattern Mining of Multimodal Streams in the Humanities

Research in the humanities is increasingly attracted by data mining and data management techniques in order to efficiently deal with complex scientific corpora. Particularly, the exploration of hidden patterns within different types of data streams arising from psycholinguistic experiments is of growing interest in the area of translation process research. In order to support psycholinguistic e...

Mining Compressing Patterns in a Data Stream

Mining patterns that compress the data well was shown to be an effective approach for extracting meaningful patterns and solving the redundancy issue in frequent pattern mining. Most of the existing works in the literature consider mining compressing patterns from a static database of itemsets or sequences. These approaches require multiple passes through the data and do not scale up with the s...

Journal title:
  • J. Inf. Sci. Eng.

Volume 29  Issue 

Pages  -

Publication date 2013