Learning Source Description for Data Integration

Authors

  • AnHai Doan
  • Pedro M. Domingos
  • Alon Y. Halevy
Abstract

To build a data-integration system, the application designer must specify a mediated schema and supply descriptions of the data sources. A source description contains a source schema that describes the content of the source, and a mapping between the corresponding elements of the source schema and the mediated schema. Manually constructing these mappings is both labor-intensive and error-prone, and has proven to be a major bottleneck in deploying large-scale data integration systems in practice. In this paper we report on our initial work toward automatically learning mappings between source schemas and the mediated schema. Specifically, we investigate finding one-to-one mappings for the leaf elements of source schemas. We describe LSD, a system that automatically finds such mappings. LSD consults a set of learner modules, each of which looks at the problem from a different perspective, and then combines the predictions of the modules using a meta-learner. We report on experimental results of applying LSD to five sources in the real-estate domain.
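
The multi-learner idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the LSD system itself: the mediated-schema labels, the two toy learners (`name_learner`, `content_learner`), the fixed weights, and the weighted-sum combiner standing in for the meta-learner. The abstract does not say how the meta-learner combines predictions, so a hand-weighted sum is used here only as a stand-in.

```python
from typing import Callable, Dict, List

# Hypothetical mediated-schema labels for a real-estate domain (illustrative only).
LABELS = ["price", "address", "phone"]

Scores = Dict[str, float]
# A learner maps (source element name, sample values) to one confidence score per label.
Learner = Callable[[str, List[str]], Scores]


def normalize(scores: Scores) -> Scores:
    """Scale scores to sum to 1; fall back to uniform when all are zero."""
    total = sum(scores.values())
    if total == 0:
        return {label: 1.0 / len(scores) for label in scores}
    return {label: s / total for label, s in scores.items()}


def name_learner(name: str, values: List[str]) -> Scores:
    """Toy learner that looks only at the source element's name."""
    hints = {
        "price": ["price", "cost"],
        "address": ["addr", "location"],
        "phone": ["phone", "tel"],
    }
    lowered = name.lower()
    return normalize(
        {label: float(any(h in lowered for h in hints[label])) for label in LABELS}
    )


def content_learner(name: str, values: List[str]) -> Scores:
    """Toy learner that looks only at the element's data values."""

    def frac(pred: Callable[[str], bool]) -> float:
        return sum(1 for v in values if pred(v)) / len(values) if values else 0.0

    return normalize({
        "price": frac(lambda v: v.startswith("$")),
        "address": frac(
            lambda v: any(c.isalpha() for c in v) and any(c.isdigit() for c in v)
        ),
        "phone": frac(
            lambda v: not v.startswith("$") and sum(c.isdigit() for c in v) == 10
        ),
    })


def meta_learner(predictions: List[Scores], weights: List[float]) -> Scores:
    """Combine per-learner scores with a weighted sum (assumed fixed weights)."""
    combined = {label: 0.0 for label in LABELS}
    for scores, weight in zip(predictions, weights):
        for label in LABELS:
            combined[label] += weight * scores[label]
    return combined


def match_source(
    source: Dict[str, List[str]], learners: List[Learner], weights: List[float]
) -> Dict[str, str]:
    """Map each leaf element of a source schema to its best-scoring label."""
    mapping = {}
    for name, values in source.items():
        predictions = [learner(name, values) for learner in learners]
        combined = meta_learner(predictions, weights)
        mapping[name] = max(combined, key=combined.get)
    return mapping


# Hypothetical source with three leaf elements and sample values.
source = {
    "list-price": ["$250,000", "$310,500"],
    "location": ["123 Main St", "45 Oak Ave"],
    "contact": ["(206) 555-0100", "(425) 555-0199"],
}
mapping = match_source(source, [name_learner, content_learner], [0.4, 0.6])
print(mapping)
```

In this toy run the element named `contact` gives the name-based learner nothing to work with, so the content-based learner's evidence decides the match. This shows why combining learners with different perspectives can help. A trained combiner would learn the weights from manually mapped sources rather than fixing them by hand.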

Similar resources

Data Integration: A “Killer App” for Multistrategy Learning

To build a data-integration system, the application designer must specify a mediated schema and supply the descriptions of data sources. A source description contains a source schema that describes the content of the source, and a mapping between the corresponding elements of the source schema and the mediated schema. Manually constructing these mappings is both labor-intensive and error-prone, ...

Learning Source Descriptions for Data Integration

To build a data-integration system, the application designer must specify a mediated schema and supply the descriptions of data sources. A source description contains a source schema that describes the content of the source, and a mapping between the corresponding elements of the source schema and the mediated schema. Manually constructing these mappings is both labor-intensive and error-prone,...

Learning Mappings between Data Schemas

To build a data-integration system, the application designer must specify a mediated schema and supply the descriptions of data sources. A source description contains a source schema that describes the content of the source, and a mapping between the corresponding elements of the source schema and the mediated schema. Manually constructing these mappings is both labor-intensive and error-prone, ...

An Online Q-learning Based Multi-Agent LFC for a Multi-Area Multi-Source Power System Including Distributed Energy Resources

This paper presents an online two-stage Q-learning based multi-agent (MA) controller for load frequency control (LFC) in an interconnected multi-area multi-source power system integrated with distributed energy resources (DERs). The proposed control strategy consists of two stages. The first stage employs a PID controller whose parameters are designed using sine cosine optimization (SCO...

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that can automatically learn and improve from experience without being explicitly programmed. The primary assumption of most machine learning algorithms is that the training set (source domain) and the test set (target domain) follow the same probability distribution. However, in most of the real-world application...

Publication date: 2000