Multi-Relational Data Mining using Probabilistic Models Research Summary

Author

  • Lise Getoor
Abstract

We are often faced with the challenge of mining data represented in relational form. Unfortunately, most statistical learning methods work only with “flat” data representations. Thus, to apply these methods, we are forced to convert the data into a flat form, thereby not only losing its compact representation and structure but also potentially introducing statistical skew. These drawbacks severely limit the ability of current statistical methods to mine relational databases. Probabilistic models, in particular probabilistic relational models, allow us to represent a statistical model over a relational domain. These models can represent correlations between attributes within a single table, and between attributes in multiple tables, when these tables are related via foreign key joins. In previous work [4, 6, 8], we have developed algorithms for automatically constructing a probabilistic relational model directly from a relational database. We survey the results here and describe how the methods can be used to discover interesting dependencies in the data. We show how this class of models and our construction algorithm are ideally suited to mining multi-relational data.
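The statistical skew mentioned in the abstract can be illustrated with a small sketch. The tables, attribute names, and values below are hypothetical and are not taken from the paper; the example only shows how flattening two tables through a foreign key join can change the apparent distribution of an attribute, because entities with more related rows are counted more often.

```python
from collections import Counter

# Hypothetical toy tables (illustrative only, not from the paper).
professors = {
    "p1": {"popularity": "high"},
    "p2": {"popularity": "low"},
}
# Each course row references a professor through a foreign key.
courses = [
    {"course": "c1", "prof": "p1"},
    {"course": "c2", "prof": "p1"},
    {"course": "c3", "prof": "p1"},
    {"course": "c4", "prof": "p2"},
]

def fraction_high(values):
    """Return the fraction of entries equal to 'high'."""
    counts = Counter(values)
    return counts["high"] / len(values)

# Relational view: one row per professor.
relational = [p["popularity"] for p in professors.values()]

# Flattened view: join courses with professors, one row per course,
# so professor p1 is repeated three times.
flattened = [professors[c["prof"]]["popularity"] for c in courses]

relational_fraction = fraction_high(relational)
flattened_fraction = fraction_high(flattened)

print(relational_fraction, flattened_fraction)
```

In this toy case half of the professors are highly popular, but the flattened table suggests three quarters, simply because one professor teaches more courses. A model defined over the relational schema, as the abstract describes, avoids this kind of duplication.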

Similar resources

Learning Multi-Relational Semantics Using Neural-Embedding Models

Real-world entities (e.g., people and places) are often connected via relations, forming multi-relational data. Modeling multi-relational data is important in many research areas, from natural language processing to biological data mining [6]. Prior work on multi-relational learning can be categorized into three categories: (1) statistical relational learning (SRL) [10], such as Markov logic netw...

Relational Data Mining Using Probabilistic Relational Models

This thesis documents the design, implementation and test of Probabilistic Relational Models (PRMs). PRMs are a graphical statistical approach to modeling relational data using the Relational Language. PRMs consist of two components: the dependency structure and the parameters. Our design is based on simplicity, flexibility, and performance. We explain the search over possible structures, usin...

Multi-relational Data Mining in Medical Databases

This paper presents the application of a method for mining data in a multi-relational database that contains some information about patients struck down by chronic hepatitis. Our approach may be used on any kind of multi-relational database and aims at extracting probabilistic tree patterns from a database using Grammatical Inference techniques. We propose to use a representation of the databa...

Probabilistic Relational Model Benchmark Generation

The validation of any database mining methodology goes through an evaluation process where benchmark availability is essential. In this paper, we aim to randomly generate relational database benchmarks that allow checking probabilistic dependencies among the attributes. We are particularly interested in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs) to a relationa...

An efficient approach for effectual mining of relational patterns from multi-relational database

Data mining is an extremely challenging and promising research topic due to its strong application potential and the broad availability of massive quantities of data in databases. Still, the rising significance of data mining in real-world practice necessitates ever more sophisticated solutions, as the data comprises a huge number of records which may be stored in various tables of a rela...

Publication date: 2001