POIESIS: a Tool for Quality-aware ETL Process Redesign

Authors

  • Vasileios Theodorou
  • Alberto Abelló
  • Maik Thiele
  • Wolfgang Lehner
Abstract

We present a tool, called POIESIS, for automatic ETL process enhancement. ETL processes are essential data-centric activities in modern business intelligence environments and they need to be examined through a viewpoint that concerns their quality characteristics (e.g., data quality, performance, manageability) in the era of Big Data. POIESIS responds to this need by providing a user-centered environment for quality-aware analysis and redesign of ETL flows. It generates thousands of alternative flows by adding flow patterns to the initial flow, in varying positions and combinations, thus creating alternative design options in a multidimensional space of different quality attributes. Through the demonstration of POIESIS we introduce the tool’s capabilities and highlight its efficiency, usability and modifiability, thanks to its polymorphic design.
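
The idea of producing alternative flows by adding patterns "in varying positions and combinations" can be illustrated with a minimal sketch. This is not the POIESIS implementation: the flow is reduced to a linear list of step names, and the function `generate_alternatives`, the parameter `max_patterns`, and the pattern names `filter_nulls` and `checkpoint` are all hypothetical, chosen only for illustration.

```python
from itertools import combinations, product


def generate_alternatives(flow, patterns, max_patterns=2):
    """Yield copies of `flow` with up to `max_patterns` patterns inserted.

    Assumption: a flow is a plain list of step names and a pattern is a
    single extra step. Real ETL flows are graphs; this is a toy model.
    """
    # Insertion points between consecutive steps (not before the first
    # step and not after the last one).
    positions = range(1, len(flow))
    for k in range(1, max_patterns + 1):
        # Every combination of k distinct patterns...
        for chosen in combinations(patterns, k):
            # ...placed at every assignment of positions.
            for spots in product(positions, repeat=k):
                alt = list(flow)
                # Insert from the rightmost position first so that the
                # remaining insertion indices stay valid.
                for pos, pat in sorted(zip(spots, chosen), reverse=True):
                    alt.insert(pos, pat)
                yield alt


if __name__ == "__main__":
    initial = ["extract", "transform", "load"]
    candidates = ["filter_nulls", "checkpoint"]  # hypothetical patterns
    for alternative in generate_alternatives(initial, candidates):
        print(alternative)
```

Even this toy version shows why the number of alternatives grows quickly: it scales with the number of pattern combinations times the number of position assignments, so a tool of this kind would need some way to evaluate and rank the candidates against the quality attributes of interest.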

Similar resources

Automating User-Centered Design of Data-Intensive Processes

Business Intelligence (BI) enables organizations to collect and analyze internal and external business data to generate knowledge and business value, and provide decision support at the strategic, tactical, and operational levels. The consolidation of data coming from many sources as a result of managerial and operational business processes, usually referred to as Extract-Transform-Load (ETL) is...

ETL and Data Quality : Which Comes First ?

Usually, an early task in any data warehousing project is a detailed examination of the source systems, including an audit of data quality. Data quality issues could include inconsistent data representation, missing data and difficulty around understanding relationships between the various source systems. As ETL and Data Quality technologies converge, it’s important to use the right tools at th...

135-2011: Best Solutions for Tuning Performance of ETL Jobs in SAS® Data Integration Studio

SAS® Data Integration Studio is a great tool for building and maintaining data warehouses and data marts. The performance of the extract, transform, and load (ETL) job is critical for building data warehouses and data marts. This paper discusses the time-consuming data transformations related to ETL processes in SAS Data Integration Studio. The performance for each data transformation is benchm...

Prototype of a Web ETL Tool

Extract, transform and load (ETL) is a process that makes it possible to extract data from operational data sources, to transform data in the way needed for data warehousing purposes and to load data into a data warehouse (DW). The ETL process is the most important part of building the data warehouse. Because the ETL process is very complex and time-consuming, this paper presents a prototype of...

An Open Source ETL Tool - Medium and Small Scale Enterprise ETL(MaSSEETL)

In a Data Warehouse (DW) environment, Extraction-Transformation-Loading (ETL) processes consume up to 70% of resources. Data quality tools aim at detecting and correcting data problems that affect the accuracy and efficiency of data analysis applications. Source data imported into the data warehouse often has different quality, format, coding etc. In order to bring all the data together in a sta...

Publication date: 2015