Saving DBMS Resources While Running Batch Cycles in Data Warehouses

Author

  • Nayem Rahman
Abstract

In a large data warehouse, thousands of jobs run during each cycle across dozens of subject areas. Many of the data warehouse tables are quite large, and they need to be refreshed at the right time, several times a day, to support strategic business decisions. To enable cycles to run more frequently and keep the data warehouse environment stable, the database system's resource utilization must be optimal. This paper discusses refreshing data warehouses using a metadata model to make sure jobs under batch cycles run on an as-needed basis. The metadata model ties execution of the stored procedures in the analytical subject areas to data changes in the source staging subject area tables, so that refreshes run only for analytical tables for which new data has arrived from the operational databases. The load is skipped if source data has not changed. Skipping unnecessary loads via this metadata-driven approach yields significant savings in database resources. Resource usage statistics from an actual production data warehouse demonstrate a substantial reduction in computing resource consumption achieved by the proposed techniques.

DOI: 10.4018/978-1-4666-1752-0.ch009
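The skip-if-unchanged idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class names, the timestamp-based metadata, and the `run_cycle` function are all hypothetical, and a plain Python callable stands in for a stored procedure. The sketch assumes the metadata records when each staging table last received data and when each analytical table was last refreshed.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class LoadJob:
    """One analytical-table refresh (hypothetical stand-in for a stored procedure)."""
    target_table: str
    source_tables: List[str]
    procedure: Callable[[], None]


@dataclass
class RefreshMetadata:
    """Hypothetical metadata: change and refresh timestamps per table."""
    # When each staging table last received new data.
    source_last_changed: Dict[str, datetime] = field(default_factory=dict)
    # When each analytical table was last refreshed.
    target_last_refreshed: Dict[str, datetime] = field(default_factory=dict)

    def needs_refresh(self, job: LoadJob) -> bool:
        last_refresh = self.target_last_refreshed.get(job.target_table)
        if last_refresh is None:
            return True  # never loaded before, so run it once
        # Refresh only if at least one source changed after the last refresh.
        return any(
            self.source_last_changed.get(src, datetime.min) > last_refresh
            for src in job.source_tables
        )


def run_cycle(jobs: List[LoadJob], meta: RefreshMetadata, now: datetime) -> List[str]:
    """Run one batch cycle and return the targets that were actually refreshed."""
    executed = []
    for job in jobs:
        if meta.needs_refresh(job):
            job.procedure()
            meta.target_last_refreshed[job.target_table] = now
            executed.append(job.target_table)
        # Otherwise the load is skipped and no database work is done.
    return executed
```

In this sketch a job whose staging tables have not changed since its last refresh costs only a dictionary lookup per source table, which is the kind of saving the abstract attributes to skipping unnecessary loads.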


Related resources

Interactive Predictive Analytics with Columnar Databases

Predictive analytics is usually seen as a highly interactive task. Paradoxically, it is still performed mostly as a batch task. This not only limits its applicability, it also sets it apart from a task that is conceptually very close to it, namely OLAP analysis. The main reason for considering mining a batch task is the usually very high execution time on large data warehouses. While novel ...


Valuation Factors for the Necessity of Data Persistence in Enterprise Data Warehouses on In-Memory Databases

ETL (extraction, transformation, and loading) and data staging processes in Enterprise Data Warehouses have always been critical due to their consumption of time and resources. Mostly, the staging processes are accompanied by persistent storage of transformed data to enable reasonable performance when the data is accessed for analysis and other purposes. The persistence of – often redundant – data req...


Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems

Providing the ability to increase or decrease allocated resources on demand as the transactional load varies is essential for database management systems (DBMS) deployed on today’s computing platforms, such as the cloud. The need to maintain consistency of the database, at very large scales, while providing high performance and reliability makes elasticity particularly challenging. In this thes...


History-Based Harvesting of Spare Cycles and Storage in Large-Scale Datacenters

An effective way to increase utilization and reduce costs in datacenters is to co-locate their latency-critical services and batch workloads. In this paper, we describe systems that harvest spare compute cycles and storage space for co-location purposes. The main challenge is minimizing the performance impact on the services, while accounting for their utilization and management patterns. To ov...


Column Stores for Wide and Sparse Data

While it is generally accepted that data warehouses and OLAP workloads are excellent applications for column-stores, this paper speculates that column-stores may well be suited for additional applications. In particular we observe that column-stores do not see a performance degradation when storing extremely wide tables, and column-stores handle sparse data very well. These two properties lead ...


Journal:
  • IJTD

Volume 1

Publication date: 2010