Processing Aggregate Queries in a Federation of SPARQL Endpoints
Abstract
More and more RDF data is exposed on the Web via SPARQL endpoints. With the recent SPARQL 1.1 standard, these datasets can be queried in novel and more powerful ways, e.g., complex analysis tasks involving grouping and aggregation, and even data from multiple SPARQL endpoints, can now be formulated in a single query. This enables Business Intelligence applications that access data from federated web sources and can combine it with local data. However, as both aggregate and federated queries have become available only recently, state-of-the-art systems lack sophisticated optimization techniques that facilitate efficient execution of such queries over large datasets. To overcome these shortcomings, we propose a set of query processing strategies and the associated Cost-based Optimizer for Distributed Aggregate queries (CoDA) for executing aggregate SPARQL queries over federations of SPARQL endpoints. Our comprehensive experiments show that CoDA significantly improves performance over current state-of-the-art systems.
Similar resources
An Evaluation of SPARQL Federation Engines Over Multiple Endpoints
Due to the decentralized and linked architecture underlying Linked Data, running complex queries often requires collecting data from multiple RDF datasets. The optimization of the runtime of such queries, called federated queries, is of central importance to ensure the scalability of Semantic-Web and Linked-Data-driven applications. This has motivated a considerable body of work on SPARQL query fed...
Federated Query Formulation and Processing through BioFed
A single interface for accessing life sciences (LS) data is a natural need to master the data deluge in this domain. The data in the LS domain requires integration, and current integrative solutions increasingly rely on the federation of queries for distributed resources. This paper demonstrates BioFed, a federated SPARQL query processing system customised for LS-LOD. BioFed enables users to formulate a...
SILURIAN: a Sparql vIsuaLizer for UndeRstanding querIes And federatioNs
SPARQL federated queries can be affected by characteristics of both the query and the datasets in the federation. We present SILURIAN, a Sparql visualizer for understanding queries and federations. SILURIAN visualizes SPARQL queries and thus allows the analysis and understanding of a query's complexity with respect to the relevant endpoints and the shapes of the possible plans.
SPORTAL: Profiling the Content of Public SPARQL Endpoints
Access to hundreds of knowledge-bases has been made available on the Web through public SPARQL endpoints. Unfortunately, few endpoints publish descriptions of their content (e.g., using VoID). It is thus unclear how agents can learn about the content of a given SPARQL endpoint or, relatedly, find SPARQL endpoints with content relevant to their needs. In this paper, we investigate the feasibilit...
Federating queries in SPARQL 1.1: Syntax, semantics and evaluation
Given the sustained growth that we are experiencing in the number of SPARQL endpoints available, the need to be able to send federated SPARQL queries across these has also grown. To address this use case, the W3C SPARQL working group is defining a federation extension for SPARQL 1.1 which allows for combining graph patterns that can be evaluated over several endpoints within a single query. In ...