CLARINET: WAN-Aware Optimization for Analytics Queries

نویسندگان

  • Raajay Viswanathan
  • Ganesh Ananthanarayanan
  • Aditya Akella
چکیده

Recent work has made the case for geo-distributed analytics, where data collected and stored at multiple datacenters and edge sites world-wide is analyzed in situ to drive operational and management decisions. A key issue in such systems is ensuring low response times for analytics queries issued against geo-distributed data. A central determinant of response time is the query execution plan (QEP). Current query optimizers do not consider the network when deriving QEPs, which is a key drawback as the geo-distributed sites are connected via WAN links with heterogeneous and modest bandwidths, unlike intra-datacenter networks. We propose CLARINET, a novel WAN-aware query optimizer. Deriving a WAN-aware QEP requires working jointly with the execution layer of analytics frameworks that places tasks to sites and performs scheduling. We design efficient heuristic solutions in CLARINET to make such a joint decision on the QEP. Our experiments with a real prototype deployed across EC2 datacenters, and large-scale simulations using production workloads show that CLARINET improves query response times by ≥ 50% compared to state-of-the-art WAN-aware task placement and scheduling.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Redoop: Supporting Recurring Queries in Hadoop

The growing demand for large-scale data analytics ranging from online advertisement placement, log processing, to fraud detection, has led to the design of highly scalable data-intensive computing infrastructures such as the Hadoop platform. Recurring queries, repeatedly being executed for long periods of time on rapidly evolving high-volume data, have become a bedrock component in most of thes...

متن کامل

Context-Aware Event Stream Analytics

Complex event processing is a popular technology for continuously monitoring high-volume event streams from health care to traffic management to detect complex compositions of events. These event compositions signify critical “application contexts” from hygiene violations to traffic accidents. Certain event queries are only appropriate in particular contexts. Yet state-of-the-art streaming engi...

متن کامل

The MemSQL Query Optimizer: A modern optimizer for real-time analytics in a distributed database

Real-time analytics on massive datasets has become a very common need in many enterprises. These applications require not only rapid data ingest, but also quick answers to analytical queries operating on the latest data. MemSQL is a distributed SQL database designed to exploit memory-optimized, scale-out architecture to enable real-time transactional and analytical workloads which are fast, hig...

متن کامل

Adaptive Energy-aware Scheduling of Dynamic Event Analytics across Edge and Cloud Resources

The growing deployment of sensors as part of Internet of Things (IoT) is generating thousands of event streams. Complex Event Processing (CEP) queries offer a useful paradigm for rapid decision-making over such data sources. While often centralized in the Cloud, the deployment of capable edge devices on the field motivates the need for cooperative event analytics that span Edge and Cloud comput...

متن کامل

Preference Analytics in EXASolution

Skyline queries and the more general concept of preferences are wellknown in the database community and there are many academic approaches for the computation of the best-matching objects. Furthermore, data analytics and multicriteria optimization play an important role in Business Intelligence where it facilitates optimal decision making. Preference Analytics is the combination of preferences ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016