Load Shedding in XML Streams

Authors

  • Mingzhu Wei
  • Elke A. Rundensteiner
  • Murali Mani
Abstract

Because of the high volume and unpredictable arrival rates of data streams, stream processing systems may not always be able to keep up with the input, resulting in buffer overflow and uncontrolled loss of data. Load shedding, the prevalent strategy for solving this overflow problem, has to date been considered only for relational stream engines. XML stream engines, on the other hand, face additional challenges and opportunities for "structural shedding", due to the complex nested XML input and result structures. We now tackle this open XML shedding problem with a three-pronged solution. First, we develop a preference model for XQuery that enables users to specify the relative importance of preserving different subpatterns in the complex XML result structure. This transforms shedding into the problem of rewriting the user query into possibly several shedding queries that return approximate query answers, yet with the highest possible utility as measured by the given user preference model. Second, we develop a cost model to compare both the performance and the utility of alternate shedding queries. Third, we propose two solutions: OptShed and FastShed. OptShed is guaranteed to find an optimal solution, but at the cost of exponential complexity. FastShed, as confirmed by our experiments, efficiently achieves a close-to-optimal result in a wide range of cases. Lastly, we describe the in-automaton shedding mechanism for the Raindrop system. The experimental results show that our proposed preference-driven shedding solutions consistently achieve higher utility than the existing "relational" shedding techniques.
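The trade-off the abstract describes, between an exhaustive search that is optimal but exponential and a fast heuristic that is usually close to optimal, can be illustrated with a small sketch. This is not the paper's OptShed or FastShed: the abstract gives no algorithmic detail, so the subpattern names, the utility and cost numbers, the budget, and both function names below are hypothetical, and the heuristic shown is a plain utility-per-cost greedy choice.

```python
from itertools import combinations

# Hypothetical result subpatterns: name -> (utility, processing cost).
# Utilities stand in for user preferences; all numbers are made up.
SUBPATTERNS = {
    "title": (5.0, 2.0),
    "author": (3.0, 1.0),
    "review": (4.0, 4.0),
    "price": (2.0, 1.0),
}


def exhaustive_keep_set(patterns, budget):
    """Try every subset of subpatterns (2^n of them) and return the
    affordable subset with the highest total utility."""
    best_set, best_utility = frozenset(), 0.0
    names = sorted(patterns)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            cost = sum(patterns[n][1] for n in subset)
            utility = sum(patterns[n][0] for n in subset)
            if cost <= budget and utility > best_utility:
                best_set, best_utility = frozenset(subset), utility
    return best_set, best_utility


def greedy_keep_set(patterns, budget):
    """Keep subpatterns in decreasing utility-per-cost order while they
    still fit in the budget. Fast, but not guaranteed to be optimal."""
    ranked = sorted(
        patterns,
        key=lambda n: patterns[n][0] / patterns[n][1],
        reverse=True,
    )
    kept, utility, remaining = set(), 0.0, budget
    for name in ranked:
        u, c = patterns[name]
        if c <= remaining:
            kept.add(name)
            utility += u
            remaining -= c
    return frozenset(kept), utility


if __name__ == "__main__":
    print(exhaustive_keep_set(SUBPATTERNS, budget=4.0))
    print(greedy_keep_set(SUBPATTERNS, budget=4.0))
```

In this toy instance both strategies keep the same subpatterns and shed `review`. In general the greedy choice can fall short of the exhaustive optimum, while the exhaustive search becomes impractical as the number of subpatterns grows.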


Related resources

Delivering QoS in XML Data Stream Processing Using Load Shedding

In recent years, we have witnessed the emergence of new types of systems that deal with large volumes of streaming data. Examples include financial data analysis on feeds of stock tickers, sensor-based environmental monitoring, network traffic monitoring, and click stream analysis to push customized advertisements or detect intrusions. Traditional database management systems (DBMS), which are ver...


Continuously Providing Approximate Results under Limited Resources: Load Shedding and Spilling in XML Streams

Because of the high volume and unpredictable arrival rates, stream processing systems may not always be able to keep up with the input data streams, resulting in buffer overflow and uncontrolled loss of data. To continuously supply online results, two alternate solutions to tackle this problem of unpredictable failures of such overloaded systems can be identified. One technique, called load she...


Loadstar: A Load Shedding Scheme for Classifying Data Streams

We consider the problem of resource allocation in mining multiple data streams. Due to the large volume and the high speed of streaming data, mining algorithms must cope with the effects of system overload. How to realize maximum mining benefits under resource constraints becomes a challenging task. In this paper, we propose a load shedding scheme for classifying multiple data streams. We focus...


Loadstar: Load Shedding in Data Stream Mining

In this demo, we show that intelligent load shedding is essential in achieving optimum results in mining data streams under various resource constraints. The Loadstar system introduces load shedding techniques to classifying multiple data streams of large volume and high speed. Loadstar uses a novel metric known as the quality of decision (QoD) to measure the level of uncertainty in classificat...


Load Shedding in Data Stream Systems

Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this chapter, we focus on one particular type of adaptivity: the ability to gracefully degrade performance via "load shedding" (dropping unprocessed tuples to reduce system load) when the demands placed on the system cannot...




Publication date: 2013