Data Transformation for Warehousing Web Data
Authors
Abstract
To analyze market trends and make reasonable business plans, a company’s local data is not sufficient. Decision making must also be based on information from suppliers, partners and competitors. This external data can be obtained from the Web in many cases, but must be integrated with the company’s own data, for example, in a data warehouse. To this end, Web data has to be mapped to the star schema of the warehouse. In this paper we propose a semi-automatic approach to support this transformation process. Our approach is based on the use of a rooted labeled tree representation of Web data and the existing warehouse schema. Based on this common view we can compare source and target schemata to identify correspondences. We show how the correspondences guide the transformation so that it can be accomplished automatically. We also explain the meaning of recursion and restructuring in mapping rules, which are the core of the transformation algorithm.
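As a rough illustration of the idea of comparing two rooted labeled trees, the following minimal sketch pairs nodes of a Web-data tree with nodes of a warehouse-schema tree whenever their labels match. It is not the paper's algorithm: the `Node` class, the helper names, the case-insensitive label comparison, and the example trees are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a rooted labeled tree (hypothetical structure)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def paths(node, prefix=()):
    """Yield the label path from the root to every node, in pre-order."""
    current = prefix + (node.label,)
    yield current
    for child in node.children:
        yield from paths(child, current)


def correspondences(source, target):
    """Pair source and target paths whose last labels are equal.

    Assumption: labels are compared case-insensitively. A real matcher
    would likely also use synonyms, types, or user feedback.
    """
    target_by_label = {}
    for path in paths(target):
        target_by_label.setdefault(path[-1].lower(), []).append(path)
    return [
        (src, tgt)
        for src in paths(source)
        for tgt in target_by_label.get(src[-1].lower(), [])
    ]


# Hypothetical Web source tree and warehouse (star schema) tree.
web = Node("offer", [
    Node("product", [Node("name"), Node("price")]),
    Node("supplier"),
])
star = Node("sales", [
    Node("Product", [Node("Name")]),
    Node("Supplier"),
    Node("Price"),
])

for src, tgt in correspondences(web, star):
    print("/".join(src), "->", "/".join(tgt))
```

In this toy example `price` sits under `product` in the source but directly under the root in the target, which is the kind of case where a mapping rule would have to restructure the data rather than copy it in place.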
Similar resources
Materializing Web Data
Business decisions must rely not only on company-internal data but also on external data from competitors or relevant events. This information can be obtained from the WWW but must be integrated with the data in a company's data warehouse. In this paper we discuss a system architecture for warehousing Web content for OLAP and DSS. A self-describing object model is used to make the implicit mode...
Object-Oriented Data Warehousing
Data warehousing has largely developed with little or no reference to Object-Oriented Software Engineering (OOSE) [1]. This is consistent with (a) its development out of two-tier client/server relational database methodology, and (b) its character as a kind of high-level systems integration, rather than software development, activity. Data Warehousing assembles components, rather than creating t...
Query Optimization, Data Warehousing and Data Mining for Scientific Simulation
By Yingping Huang. This thesis examines the application of infrastructure, query optimization, data warehousing and data mining technologies to the area of scientific simulation. One application of scientific simulation is on the behavior of natural organic matter (NOM). NOM is a heterogeneous mixture of organic molecules found in terrestrial and aquatic environments, from forest soils and streams...
Chapter I, "Combining Data Warehousing and Data Mining Techniques for Web Log Analysis"
In enterprises, a large volume of data has been collected and stored in data warehouses. Advances in data gathering, storage, and distribution have created a need for integrating data warehousing and data mining techniques. Mining data warehouses raises unique issues and requires special attention. Data warehousing and data mining are interrelated, and require holistic techniques from the two ...
Warehousing complex data from the web
Data warehousing and OLAP technologies are now moving on to handling complex data that mostly originate from the Web. However, integrating such data into a decision-support process requires representing them in a form that OLAP and/or data mining techniques can process. We present in this paper a complex data warehousing methodology that exploits XML as a pivot language. Our approach inc...