Efficient mining method for retrieving sequential patterns over online data streams

Authors

  • Joong Hyuk Chang
  • Won Suk Lee
Abstract

Data mining has proved useful across many fields of information science, and a variety of mining methods have been proposed in previous research. Recently, data in these fields has increasingly taken the form of continuous data streams rather than finite stored data sets. This paper proposes a method for mining sequential patterns over an online sequence data stream, which is useful for retrieving knowledge embedded in the stream. The proposed method minimizes the memory usage of the mining process by allowing some error in the mining result, and it supports a flexible trade-off between memory usage and mining accuracy. The error itself is kept small by an accurate method for estimating the count of a sequence, one that takes the ordering of items into account. The method can also capture a recent change in a sequence data stream quickly, through a decaying mechanism that gracefully discards old information that may no longer be useful.
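The decaying mechanism and the memory/accuracy trade-off described in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm, which the abstract does not specify. It is a generic decayed counter over ordered item pairs, and the class name `DecayedPairCounter`, the parameters `decay` and `prune_below`, and the restriction to length-2 patterns are all assumptions made for illustration.

```python
from itertools import combinations


class DecayedPairCounter:
    """Illustrative only: decayed counts of ordered item pairs in a stream.

    All names and parameters here are hypothetical, not taken from the paper.
    """

    def __init__(self, decay=0.9, prune_below=0.1):
        self.decay = decay              # weight multiplier applied per new sequence
        self.prune_below = prune_below  # entries decayed below this are dropped
        self.total = 0.0                # decayed number of sequences seen so far
        self.tick = 0                   # index of the latest sequence
        self.entries = {}               # pattern -> (count, tick of last update)

    def _current(self, pattern):
        # Decay lazily: apply the decay accumulated since the last update.
        count, last = self.entries[pattern]
        return count * self.decay ** (self.tick - last)

    def add(self, sequence):
        self.tick += 1
        self.total = self.total * self.decay + 1.0
        # combinations() preserves positional order, so ("a", "b") means
        # "a occurs before b" and is distinct from ("b", "a").
        for pattern in set(combinations(sequence, 2)):
            old = self._current(pattern) if pattern in self.entries else 0.0
            self.entries[pattern] = (old + 1.0, self.tick)
        # Trade accuracy for memory: forget patterns whose weight has faded.
        stale = [p for p in self.entries if self._current(p) < self.prune_below]
        for pattern in stale:
            del self.entries[pattern]

    def support(self, pattern):
        if pattern not in self.entries or self.total == 0:
            return 0.0
        return self._current(pattern) / self.total
```

With a small `decay`, a pattern that stops appearing loses weight quickly and is eventually pruned, so a recent change in the stream dominates the supports after only a few sequences. Raising `prune_below` saves memory at the cost of underestimating patterns that later reappear.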

Similar resources

Efficiently Mining High Utility Sequential Patterns in Static and Streaming Data

High utility sequential pattern (HUSP) mining has emerged as a novel topic in data mining. Although some preliminary works have been conducted on this topic, they incur the problem of producing a large search space for high utility sequential patterns. In addition, they mainly focus on mining HUSPs in static databases and do not take streaming data into account, where unbounded data come contin...

Efficient Mining of High Utility Sequential Patterns Over Data Streams

High utility sequential pattern mining has emerged as an important topic in data mining. Although several preliminary works have been conducted on this topic, the existing studies mainly focus on mining high utility sequential patterns (HUSPs) in static databases and do not consider the streaming data. Mining HUSPs over data streams is very desirable for many applications. However, addressing t...

A Single-scan Algorithm for Mining Sequential Patterns from Data Streams

Sequential pattern mining (SPAM) is one of the most interesting research issues of data mining. In this paper, a new research problem of mining data streams for sequential patterns is defined. A data stream is an unbound sequence of data elements arriving at a rapid rate. Based on the characteristics of data streams, the problem complexity of mining data streams for sequential patterns is more ...

Need For Speed: Mining Sequential Patterns in Data Streams

Recently, the data mining community has focused on a new challenging model where data arrives sequentially in the form of continuous rapid streams. It is often referred to as data streams or streaming data. Many real-world applications data are more appropriately handled by the data stream model than by traditional static databases. Such applications can be: stock tickers, network traffic measu...

SPAMS: A Novel Incremental Approach for Sequential Pattern Mining in Data Streams

Mining sequential patterns in data streams is a new challenging problem for the datamining community since data arrives sequentially in the form of continuous rapid and infinite streams. In this paper, we propose a new on-line algorithm, SPAMS, to deal with the sequential patterns mining problem in data streams. This algorithm uses an automaton-based structure to maintain the set of frequent se...


Journal:
  • J. Information Science

Volume 31

Publication date: 2005