Adaptive Load Shedding for Mining Frequent Patterns from Data Streams

نویسندگان

  • Xuan-Hong Dang
  • Wee Keong Ng
  • Kok-Leong Ong
چکیده

Most algorithms that focus on discovering frequent patterns from data streams assumed that the machinery is capable of managing all the incoming transactions without any delay; or without the need to drop transactions. However, this assumption is often impractical due to the inherent characteristics of data stream environments. Especially under high load conditions, there is often a shortage of system resources to process the incoming transactions. This causes unwanted latencies that in turn, affects the applicability of the data mining models produced – which often has a small window of opportunity. We propose a load shedding algorithm to address this issue. The algorithm adaptively detects overload situations and drops transactions from data streams using a probabilistic model. We tested our algorithm on both synthetic and real-life datasets to verify the feasibility of our algorithm.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Todays, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection, work either on uncertai...

متن کامل

Mining Data Stream for Load Shedding

Data stream is continuous flow of data, which necessitates load shedding for data stream processing system. Here we study overload handling for frequent pattern mining indata streams. Here in this paper load shedding use frequent pattern matching algorithm i.e priority, transaction and attribute in overload situation. The heavy workload or continues stream of the mining algorithm lies mostly in...

متن کامل

An Efficient Sequential Frequent Pattern Analysis Using DBCA

In this work, the practical problem of frequent-itemset discovery in data-stream environments which may suffer from data overload. The main issues include frequent-pattern mining and data-overload handling. Therefore, a mining algorithm together with Separate dedicated overload-handling mechanisms is proposed. The algorithm DBCA (Dynamic Base Combinatorial Algorithm) extracts basic information ...

متن کامل

Loadstar: A Load Shedding Scheme for Classifying Data Streams

We consider the problem of resource allocation in mining multiple data streams. Due to the large volume and the high speed of streaming data, mining algorithms must cope with the effects of system overload. How to realize maximum mining benefits under resource constraints becomes a challenging task. In this paper, we propose a load shedding scheme for classifying multiple data streams. We focus...

متن کامل

Loadstar: Load Shedding in Data Stream Mining

In this demo, we show that intelligent load shedding is essential in achieving optimum results in mining data streams under various resource constraints. The Loadstar system introduces load shedding techniques to classifying multiple data streams of large volume and high speed. Loadstar uses a novel metric known as the quality of decision (QoD) to measure the level of uncertainty in classificat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006