The Effect of Index Partitioning Schemes on the Performance of Distributed Query Processing
Abstract
The benefit of using indexes for processing conjunctive queries in a database system is well known. The use of indexes in distributed database systems is equally justified. In a distributed database environment, a relation may be horizontally partitioned across the nodes of the system, and indexes may be created for the fragment of the relation that resides at each node. As an alternative, one might construct each index on the entire relation (a global index) and then partition each index among the nodes. An approach is presented for processing such an index partitioning scheme in response to a conjunctive range query. The performance of these schemes is evaluated in terms of the response time of a query and the utilization of processors, disk, and communication network while varying the number of nodes and the query mix.
Index Terms: Distributed database system, query processing, indexing scheme, partitioned global index, partial index, conjunctive queries, simulation, performance evaluation.
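The two schemes contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the range-based assignment of index keys to nodes, the `boundaries` parameter, and the restriction to a single-attribute range predicate are all assumptions made for illustration. It only shows that with per-fragment (partial) indexes every node probes its own index, while with a range-partitioned global index only the partitions overlapping the query range are probed.

```python
from bisect import bisect_left, bisect_right


def scan(index, lo, hi):
    """Return the entries of a sorted index whose key lies in [lo, hi]."""
    keys = [entry[0] for entry in index]
    return index[bisect_left(keys, lo):bisect_right(keys, hi)]


def build_local_indexes(fragments, attr):
    """Partial indexes: one sorted index per node, over that node's fragment only."""
    return [sorted((row[attr], rid) for rid, row in frag.items())
            for frag in fragments]


def build_partitioned_global_index(fragments, attr, boundaries):
    """Global index over the whole relation, split by key range (an assumption).

    Partition i holds keys below boundaries[i]; the last partition holds the rest.
    Each entry records the node and row id where the tuple actually resides.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for node, frag in enumerate(fragments):
        for rid, row in frag.items():
            key = row[attr]
            parts[bisect_right(boundaries, key)].append((key, node, rid))
    return [sorted(p) for p in parts]


def range_query_local(local_indexes, lo, hi):
    """Every node must probe its own index; returns (hits, indexes probed)."""
    hits = []
    for node, index in enumerate(local_indexes):
        hits.extend((node, rid) for _, rid in scan(index, lo, hi))
    return sorted(hits), len(local_indexes)


def range_query_global(global_parts, boundaries, lo, hi):
    """Only partitions overlapping [lo, hi] are probed; returns (hits, partitions probed)."""
    first = bisect_right(boundaries, lo)
    last = bisect_right(boundaries, hi)
    hits = []
    for p in range(first, last + 1):
        hits.extend((node, rid) for _, node, rid in scan(global_parts[p], lo, hi))
    return sorted(hits), last - first + 1
```

Both functions return the same `(node, row id)` pairs for a given range; they differ in how many index partitions are touched, which is one of the factors behind the response-time and utilization trade-offs the abstract mentions. A conjunctive query over several attributes would additionally need to intersect the hits from one index per attribute, which this sketch leaves out.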
Similar resources
Effect of Inverted Index Partitioning Schemes on Performance of Query Processing in Parallel Text Retrieval Systems
Shared-nothing, parallel text retrieval systems require an inverted index, representing a document collection, to be partitioned among a number of processors. In general, the index can be partitioned based on either the terms or documents in the collection, and the way the partitioning is done greatly affects the query processing performance of the parallel system. In this work, we investigate ...
Distributed Query Processing Using Partitioned Inverted Files
In this paper, we study query processing in a distributed text database. The novelty is a real distributed architecture implementation that offers concurrent query service. The distributed system adopts a network of workstations model and the client-server paradigm. The document collection is indexed with an inverted file. We adopt two distinct strategies of index partitioning in the distribute...
Index File Partitioning in Parallel Database Systems
In a parallel database system, a table is often partitioned into multiple fragments and stored on different nodes in order to exploit I/O parallelism. Since using an index is typical for processing a database query, the problem of how to design the index for such partitioned tables can be a crucial performance factor in a parallel database. In terms of the index for partitioned tables, we can th...
On the Impact of Random Index-Partitioning on Index Compression
The performance of processing search queries depends heavily on the stored index size. Accordingly, considerable research efforts have been devoted to the development of efficient compression techniques for inverted indexes. Roughly, index compression relies on two factors: the ordering of the indexed documents, which strives to position similar documents in proximity, and the encoding of the i...
SAMUEL: A Sharing-based Approach to processing Multiple SPARQL Queries with MapReduce
The volume of RDF data is now growing tremendously. It is thus considered prudent to store and process massive RDF data with distributed SPARQL engines instead of relying on a single-machine system. Many sophisticated index and partitioning schemes have also been proposed to support SPARQL query evaluations. However, existing SPARQL engines have mainly followed one-at-a-time scheme so that query e...
Journal title:
- IEEE Trans. Knowl. Data Eng.
Volume 5, issue
Pages -
Publication date: 1993