Latent-class statistical relational learning with uncertain formal ontologies

Author

  • Achim Rettinger
Abstract

Ontologies are a well-researched formalism in computer science for representing knowledge in a machine-processable manner. Recently, with the growth of the Semantic Web and social networks, ontologies with large amounts of data have become available. While such knowledge representations are intended for data integration and the deduction of implicit knowledge, they are in general not designed to induce uncertain knowledge. This thesis focuses on the statistical analysis of known as well as deducible facts in ontologies and the induction of uncertain knowledge from both. The uncertainty of induced knowledge originates not only from a lack of information but also from potentially untrustworthy sources providing the facts. We outline common ontology languages, their (formal) semantics, and their expressivity from the perspective of existing real-world data sources. We then discuss existing learning methods that could be used for the automated statistical analysis of facts given in such ontologies and for the induction of facts that are not explicitly given in the ontology. Here, two fundamental approaches to artificial intelligence need to be combined: deductive reasoning by logical consequence and inductive learning by statistical generalization. The main contribution of this work is a machine learning approach that enforces hard constraints during the learning phase. In short, this is achieved by checking the description logic satisfiability of statements inductively inferred by an infinite latent-class multi-relational Bayesian learning method based on Dirichlet process mixture models. This guarantees the compliance of induced facts with deductive arguments and can lead to improved cluster analysis and predictive performance. To demonstrate the effectiveness of our approach, we provide experiments using real-world social network data in the form of an OWL DL ontology. In addition, learning from untrustworthy facts in formal knowledge representations is discussed.
We show how to model and learn context-sensitive trust using our learning method. The efficiency and performance of our approach are evaluated empirically, e.g., with user ratings gathered from online auctions. In summary, this thesis contributes to the combination of inductive and deductive reasoning approaches and, more specifically, to latent-class statistical learning with description logic ontologies containing information from potentially untrustworthy sources.
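
The core idea of the abstract, combining a Dirichlet process mixture over latent classes with a hard-constraint check on induced statements, can be illustrated with a toy sketch. This is not the thesis's implementation: `crp_assignments` only samples from a Chinese restaurant process prior (no likelihood or inference), `is_satisfiable` is a hypothetical stand-in for a real description logic reasoner that only understands class disjointness, and all names and example facts are assumptions made for illustration.

```python
import random
from collections import Counter


def crp_assignments(n_entities, alpha, rng):
    """Sample latent-class labels for entities from a Chinese restaurant process.

    This is only the prior of a Dirichlet process mixture: an existing class is
    chosen proportionally to its size, a new class proportionally to alpha.
    """
    assignments = []
    counts = Counter()
    for _ in range(n_entities):
        classes = list(counts.keys())           # labels 0..K-1 in creation order
        weights = [counts[k] for k in classes] + [alpha]
        choice = rng.choices(classes + [len(classes)], weights=weights)[0]
        counts[choice] += 1
        assignments.append(choice)
    return assignments


def is_satisfiable(known_facts, candidate, disjoint_pairs):
    """Toy stand-in for a description logic satisfiability check (assumption).

    A candidate (entity, cls) is rejected if the entity is already asserted to
    belong to a class declared disjoint with cls.
    """
    entity, cls = candidate
    for known_entity, known_cls in known_facts:
        if known_entity != entity:
            continue
        if (known_cls, cls) in disjoint_pairs or (cls, known_cls) in disjoint_pairs:
            return False
    return True


def constrained_predictions(scores, known_facts, disjoint_pairs):
    """Zero out induced statements that contradict the hard constraints."""
    return {
        candidate: (p if is_satisfiable(known_facts, candidate, disjoint_pairs) else 0.0)
        for candidate, p in scores.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(crp_assignments(10, alpha=1.0, rng=rng))

    known = {("anna", "Minor")}                 # hypothetical asserted fact
    disjoint = {("Minor", "Adult")}             # hypothetical ontology axiom
    scores = {("anna", "Adult"): 0.7, ("bob", "Adult"): 0.6}
    print(constrained_predictions(scores, known, disjoint))
```

In this sketch the constraint check acts as a filter on statistically induced statements, so a statement with a high predicted probability is still discarded if it would make the knowledge base inconsistent.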

Similar resources

Latent-Class Statistical Relational Learning from Formal Knowledge

We propose a learning approach for integrating formal knowledge into statistical inference by exploiting ontologies as a semantically rich and fully formal representation of prior knowledge. The logical constraints deduced from ontologies can be utilized to enhance and control the learning task by enforcing description logic satisfiability in a latent multi-relational graphical model. To demons...

Statistical Relational Learning with Formal Ontologies

We propose a learning approach for integrating formal knowledge into statistical inference by exploiting ontologies as a semantically rich and fully formal representation of prior knowledge. The logical constraints deduced from ontologies can be utilized to enhance and control the learning task by enforcing description logic satisfiability in a latent multi-relational graphical model. To demons...

Integrating Ontological Prior Knowledge into Relational Learning

Ontologies represent an important source of prior information which lends itself to the integration into statistical modeling. This paper discusses approaches towards employing ontological knowledge for relational learning. Our analysis is based on the IHRM model that performs relational learning by including latent variables that can be interpreted as cluster variables of the entities in the d...

Terminological ontology learning and population using latent Dirichlet allocation

The success of the Semantic Web will rely heavily on the availability of formal ontologies to structure machine-understandable data. However, there is still a lack of general methodologies for automatic ontology learning and population, i.e., the generation of domain ontologies from various kinds of resources by applying natural language processing and machine learning techniques. In this paper, the a...

Demand-Driven Clustering in Relational Domains for Predicting Adverse Drug Events

Learning from electronic medical records (EMR) is challenging due to their relational nature and the uncertain dependence between a patient's past and future health status. Statistical relational learning is a natural fit for analyzing EMRs but is less adept at handling their inherent latent structure, such as connections between related medications or diseases. One way to capture the latent st...

Publication date: 2010