Distributed Processing of Conjunctive Queries
Authors
Abstract
We distinguish two phases in Web query processing: (a) retrieving information on documents related to the queries and ranking them, and (b) generating snippets, title, and URL information for the answer page. Using real data and a small cluster of index servers, we study four key issues related to the first phase of query processing: load balance, broker behavior, performance of individual index servers, and overall throughput. Our study reveals interesting tradeoffs: (1) load imbalance at low query arrival rates can be controlled by the simple measure of randomizing the distribution of documents among the index servers, (2) the broker is not a bottleneck, (3) disk and CPU utilization at individual servers depend on the relationship between memory size and the distribution of frequencies for the query terms, and (4) load imbalance at high loads prevents higher throughput.
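The first phase described above can be illustrated with a minimal sketch, assuming a document-partitioned index in which each document is assigned to an index server uniformly at random and a broker forwards every conjunctive (AND) query to all servers and merges their answers. The function names (`partition_documents`, `build_local_index`, `local_conjunctive`, `broker_query`) and the toy corpus are hypothetical and are not taken from the paper; ranking is omitted.

```python
import random
from collections import defaultdict


def partition_documents(doc_ids, num_servers, seed=0):
    """Assign each document to an index server uniformly at random."""
    rng = random.Random(seed)
    shards = [[] for _ in range(num_servers)]
    for doc_id in doc_ids:
        shards[rng.randrange(num_servers)].append(doc_id)
    return shards


def build_local_index(shard, corpus):
    """Build an inverted index (term -> set of doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id in shard:
        for term in corpus[doc_id].split():
            index[term].add(doc_id)
    return index


def local_conjunctive(index, terms):
    """Return the doc ids in this shard that contain every query term."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    # Intersect starting from the shortest posting list to keep work small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


def broker_query(indexes, terms):
    """Broker: send the query to every server and merge the partial answers."""
    hits = set()
    for index in indexes:
        hits |= local_conjunctive(index, terms)
    return sorted(hits)


# Toy corpus (hypothetical data, for illustration only).
corpus = {
    0: "web query processing",
    1: "query index server",
    2: "web index",
    3: "web query index",
}
shards = partition_documents(corpus, num_servers=2, seed=1)
indexes = [build_local_index(shard, corpus) for shard in shards]
matches = broker_query(indexes, ["web", "query"])
```

Because every document lives on exactly one server, the merged answer does not depend on how the random assignment falls; the assignment only affects how evenly the work is spread across servers.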
Similar resources
Publish/Subscribe with RDF Data over Large Structured Overlay Networks
We study the problem of evaluating RDF queries over structured overlay networks. We consider the publish/subscribe scenario where nodes subscribe with long-standing queries and receive notifications whenever triples matching their queries are inserted in the network. In this paper we focus on conjunctive multi-predicate queries. We demonstrate that these queries are useful in various modern app...
Conjunctive Query Answering in Distributed Ontology Systems for Ontologies with Large OWL ABoxes
We present a query processing procedure for conjunctive queries in distributed ontology systems where a large ontology is divided into ontology fragments that are later distributed over a set of autonomous nodes. We focus on ontologies with large ABoxes. The query processing procedure determines and retrieves the facts that are relevant to answering a given query from other nodes, then construc...
Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1
We propose an efficient and scalable architecture for processing generalized graph-pattern queries as they are specified by the current W3C recommendation of the SPARQL 1.1 “Query Language” component. Specifically, the class of queries we consider consists of sets of SPARQL triple patterns with labeled property paths. From a relational perspective, this class resolves to conjunctive queries of ...
Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks
We study the problem of evaluating conjunctive queries composed of triple patterns over RDF data stored in distributed hash tables. Our goal is to develop algorithms that scale to large amounts of RDF data, distribute the query processing load evenly and incur little network traffic. We present and evaluate two novel query processing algorithms with these possibly conflicting goals in mind. We ...
Query Folding
Query folding refers to the activity of determining if and how a query can be answered using a given set of resources, which might be materialized views, cached results of previous queries, or queries answerable by another database. We investigate query folding in the context where queries and resources are conjunctive queries. We develop an exponential-time algorithm that finds all foldings, and...
Search for the Best but Expect the Worst - Distributed Top-k Queries over Decreasing Aggregated Scores
We consider distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers. In contrast to existing work, we exclusively consider distributed top-k queries over decreasing aggregated values. State-of-the-art distributed top-k algorithms usually depend on threshold propagation to reduce expen...