POIESIS: a Tool for Quality-aware ETL Process Redesign
Authors
Abstract
We present a tool, called POIESIS, for automatic ETL process enhancement. ETL processes are essential data-centric activities in modern business intelligence environments and they need to be examined through a viewpoint that concerns their quality characteristics (e.g., data quality, performance, manageability) in the era of Big Data. POIESIS responds to this need by providing a user-centered environment for quality-aware analysis and redesign of ETL flows. It generates thousands of alternative flows by adding flow patterns to the initial flow, in varying positions and combinations, thus creating alternative design options in a multidimensional space of different quality attributes. Through the demonstration of POIESIS we introduce the tool’s capabilities and highlight its efficiency, usability and modifiability, thanks to its polymorphic design.
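The idea of generating alternatives by adding flow patterns "in varying positions and combinations" can be illustrated with a minimal Python sketch. This is not the POIESIS implementation: it assumes a flow is a simple linear sequence of step names (real ETL flows are generally graphs), the pattern names `dedup` and `checkpoint` are hypothetical, and the function `generate_alternatives` is introduced here only for illustration. Scoring each alternative against quality attributes is left out.

```python
from itertools import combinations, product


def generate_alternatives(flow, patterns, max_patterns=2):
    """Yield alternative flows built by inserting patterns into `flow`.

    Assumption (not from the source): a flow is a linear tuple of step
    names, and a pattern may be inserted between any two existing steps.
    """
    positions = range(1, len(flow))  # insertion points between steps
    for k in range(1, max_patterns + 1):
        for chosen in combinations(patterns, k):        # which patterns
            for spots in product(positions, repeat=k):  # where they go
                alt = list(flow)
                # Insert right-to-left so earlier indices stay valid.
                pairs = sorted(zip(chosen, spots),
                               key=lambda pair: pair[1], reverse=True)
                for pattern, pos in pairs:
                    alt.insert(pos, pattern)
                yield tuple(alt)


initial_flow = ("extract", "transform", "load")
hypothetical_patterns = ["dedup", "checkpoint"]  # illustrative names only
alternatives = list(generate_alternatives(initial_flow, hypothetical_patterns))
```

Even this toy setting shows how quickly the design space grows: the number of alternatives rises with both the number of insertion points and the number of patterns combined, which is consistent with the abstract's mention of thousands of alternative flows for realistic inputs.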
Similar resources
Automating User-Centered Design of Data-Intensive Processes
Business Intelligence (BI) enables organizations to collect and analyze internal and external business data to generate knowledge and business value, and provide decision support at the strategic, tactical, and operational levels. The consolidation of data coming from many sources as a result of managerial and operational business processes, usually referred to as Extract-Transform-Load (ETL) is...
ETL and Data Quality: Which Comes First?
Usually, an early task in any data warehousing project is a detailed examination of the source systems, including an audit of data quality. Data quality issues could include inconsistent data representation, missing data and difficulty around understanding relationships between the various source systems. As ETL and Data Quality technologies converge, it’s important to use the right tools at th...
135-2011: Best Solutions for Tuning Performance of ETL Jobs in SAS® Data Integration Studio
SAS® Data Integration Studio is a great tool for building and maintaining data warehouses and data marts. The performance of the extract, transform, and load (ETL) job is critical for building data warehouses and data marts. This paper discusses the time-consuming data transformations related to ETL processes in SAS Data Integration Studio. The performance for each data transformation is benchm...
Prototype of a Web ETL Tool
Extract, transform and load (ETL) is a process that makes it possible to extract data from operational data sources, to transform data in the way needed for data warehousing purposes and to load data into a data warehouse (DW). The ETL process is the most important part of building the data warehouse. Because the ETL process is very complex and time-consuming, this paper presents a prototype of...
An Open Source ETL Tool - Medium and Small Scale Enterprise ETL (MaSSEETL)
In a Data Warehouse (DW) environment, Extraction-Transformation-Loading (ETL) processes consume up to 70% of resources. Data quality tools aim at detecting and correcting data problems that affect the accuracy and efficiency of data analysis applications. Source data imported into the data warehouse often has different quality, format, coding etc. In order to bring all the data together in a sta...