Sorted Neighborhood for Schema-free RDF Data

Authors

  • Mayank Kejriwal
  • Daniel P. Miranker
Abstract

Entity Resolution (ER) concerns identifying pairs of entities that refer to the same underlying entity. To avoid O(n^2) pairwise comparisons of n entities, blocking methods are used. Sorted Neighborhood is an established blocking method for relational databases. It has not been applied to schema-free Resource Description Framework (RDF) data sources, which are widely prevalent in the Linked Data ecosystem. This paper presents a Sorted Neighborhood workflow that may be applied to schema-free RDF data. The workflow is modular and makes minimal assumptions about its inputs. Empirical evaluations of the proposed algorithm on five real-world benchmarks demonstrate its utility compared to two state-of-the-art blocking baselines.
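
To make the blocking idea concrete, the following is a minimal sketch of the classic relational Sorted Neighborhood method that the abstract refers to, not the RDF workflow proposed in the paper. Records are sorted by a blocking key, and only records that fall within a sliding window of size w are paired. The names `token_key` and `sorted_neighborhood`, the sample records, and the choice of key (lowercased, alphabetically sorted tokens) are assumptions made purely for illustration.

```python
import re
from typing import Callable, List, Sequence, Tuple

Record = Tuple[str, str]  # (identifier, label); a hypothetical record shape


def token_key(label: str) -> str:
    """Hypothetical blocking key: lowercased alphanumeric tokens, sorted."""
    return " ".join(sorted(re.findall(r"[a-z0-9]+", label.lower())))


def sorted_neighborhood(
    records: Sequence[Record],
    key: Callable[[str], str],
    window: int,
) -> List[Tuple[str, str]]:
    """Return candidate identifier pairs using a sliding window over sorted keys."""
    if window < 2:
        raise ValueError("window must be at least 2")
    # Sort once by blocking key; Python's sort is stable, so ties keep input order.
    ordered = sorted(records, key=lambda rec: key(rec[1]))
    pairs = []
    for i, (left_id, _) in enumerate(ordered):
        # Pair each record only with the next (window - 1) records in sort order.
        for right_id, _ in ordered[i + 1 : i + window]:
            pairs.append((left_id, right_id))
    return pairs


# Illustrative records only; they do not come from the paper's benchmarks.
records = [
    ("a1", "Daniel Miranker"),
    ("b7", "Mayank Kejriwal"),
    ("a2", "Miranker, Daniel"),
    ("b9", "Kejriwal, M."),
]
candidates = sorted_neighborhood(records, token_key, window=2)
all_pairs = len(records) * (len(records) - 1) // 2
print(candidates, "out of", all_pairs, "possible pairs")
```

With n records and a window of size w, the sketch emits at most n(w - 1) candidate pairs instead of n(n - 1)/2, which is where the saving over exhaustive comparison comes from. How well true matches are retained depends heavily on the blocking key, and choosing one without a schema is the part this sketch leaves out.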

Similar resources

Sorted Neighborhood for the Semantic Web

Entity Resolution (ER) concerns identifying logically equivalent entity pairs across databases. To avoid Θ(n^2) pairwise comparisons of n entities, blocking methods are used. Sorted Neighborhood is an established blocking method for relational databases. It has not been applied on graph-based data models such as the Resource Description Framework (RDF). This poster presents a modular workflow for...

ExpLOD: Exploring Interlinking and RDF Usage in the Linked Open Data Cloud

The Linking Open Data community project is promoting the creation of interlinked RDF datasets with links between data items identified using dereferenceable URIs. This promising direction for publishing data on the web brings forward a number of issues. A key challenge is to understand the data, the schema, and the interlinks that are actually used both within and across linked datasets. Unders...

An Improved Semantic Schema Matching Approach

Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2], and schema integration. The task is to find the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...

Functional Queries to Wrapped Educational Semantic Web Meta-Data

The aim of the Edutella project is to provide a peer-to-peer infrastructure for educational material retrieval using semantic web meta-data descriptions of educational resources. Edutella uses the semantic web meta-data description languages RDF and RDF-Schema for describing web resources. The aim of this work is to wrap the Edutella infrastructure with a functional mediator system. This makes ...

Object-Oriented RuleML: User-Level Roles, URI-Grounded Clauses, and Order-Sorted Terms

This paper describes an Object-Oriented extension to RuleML as a modular combination of three sublanguages. (1) User-level roles provide frame-like slot representations as unordered argument collections in atoms and complex terms. (2) URI-grounded clauses allow for ‘webizing’ using URIs as object identifiers for facts and rules. (3) Order-sorted terms permit typed variables via Web links into ta...

Publication date: 2015