The Impact of Buffering on Closest Pairs Queries Using R-Trees
Abstract
This paper investigates the most appropriate buffer structure, page replacement policy and buffering scheme for closest pairs queries in which both spatial datasets are stored in R-trees. Three buffer structures (single, hybrid and by levels) are studied over two buffering schemes (local to each R-tree, and global to the query) using several page replacement algorithms (e.g. FIFO, LRU, 2Q). To answer K closest pairs queries (K-CPQs, with K ≥ 1) we employ recursive and non-recursive (iterative) branch-and-bound algorithms. The outcome of this study is the best-performing configuration (in terms of buffer structure, page replacement algorithm and buffering scheme) for CPQs. In all cases, when buffer space is available, the savings in disk accesses are larger for a recursive algorithm than for a non-recursive one. For recursive algorithms, the global buffering scheme is more appropriate for small or medium buffer sizes, whereas the local scheme is the best choice for large buffers. For non-recursive algorithms, the global buffering scheme is the best choice in all cases. Moreover, LRU is the most appropriate page replacement algorithm for small or medium buffer sizes for both types of branch-and-bound algorithms. When the buffer is large enough, FIFO and LRU are the best choices for recursive algorithms, and 2Q for non-recursive ones.
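To make the terms in the abstract concrete, the following is a minimal sketch of how disk accesses might be counted for a page-access trace under FIFO or LRU replacement, with either one global buffer shared by both R-trees or a local buffer per tree. It is not the paper's implementation: the names `PageBuffer` and `count_disk_accesses`, the trace format, and the even split of buffer space between trees in the local scheme are all assumptions made for illustration, and 2Q as well as the hybrid and by-levels buffer structures are left out.

```python
from collections import OrderedDict


class PageBuffer:
    """Fixed-capacity page buffer with FIFO or LRU replacement (illustrative only)."""

    def __init__(self, capacity, policy="LRU"):
        if policy not in ("LRU", "FIFO"):
            raise ValueError("policy must be 'LRU' or 'FIFO'")
        self.capacity = capacity
        self.policy = policy
        self.pages = OrderedDict()  # oldest entry first
        self.disk_accesses = 0

    def read(self, page_id):
        if page_id in self.pages:
            if self.policy == "LRU":
                # A hit refreshes recency under LRU; FIFO ignores hits.
                self.pages.move_to_end(page_id)
            return
        self.disk_accesses += 1  # buffer miss: the page comes from disk
        if self.capacity == 0:
            return  # no buffer space, nothing to cache
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the oldest entry
        self.pages[page_id] = True


def count_disk_accesses(trace, capacity, policy="LRU", scheme="global"):
    """Count misses for a trace of (tree_name, page_number) reads.

    scheme="global": one buffer shared by all trees.
    scheme="local": capacity split evenly between the trees (an assumption).
    """
    if scheme == "global":
        buf = PageBuffer(capacity, policy)
        for tree, page in trace:
            buf.read((tree, page))
        return buf.disk_accesses
    if scheme == "local":
        trees = sorted({tree for tree, _ in trace})
        share = capacity // len(trees) if trees else 0
        buffers = {tree: PageBuffer(share, policy) for tree in trees}
        for tree, page in trace:
            buffers[tree].read(page)
        return sum(b.disk_accesses for b in buffers.values())
    raise ValueError("scheme must be 'global' or 'local'")


# Hypothetical trace: pages of a single R-tree named "P".
trace = [("P", 1), ("P", 2), ("P", 1), ("P", 3), ("P", 1)]
print(count_disk_accesses(trace, capacity=2, policy="LRU"))
print(count_disk_accesses(trace, capacity=2, policy="FIFO"))
```

On this toy trace LRU keeps the frequently revisited page 1 in the buffer while FIFO evicts it, so FIFO incurs one more disk access. Which policy wins on real K-CPQ traces depends on the algorithm and buffer size, as the abstract reports.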
Similar resources
Cost models for distance joins queries using R-trees
The K-Closest-Pairs Query (K-CPQ), a type of distance join in spatial databases, discovers the K pairs of objects formed from two different datasets with the K smallest distances. Recently, branch-and-bound algorithms based on R-trees have been developed in order to answer K-CPQs efficiently. For query optimization purposes, analytical models are needed to estimate the processing cost of a spec...
VA-Files vs. R*-Trees in Distance Join Queries
In modern database applications the similarity of complex objects is examined by performing distance-based queries (e.g. nearest neighbour search) on data of high dimensionality. Most multidimensional indexing methods have failed to efficiently support these queries in arbitrary high-dimensional datasets (due to the dimensionality curse). Similarity join queries and K closest pairs queries are ...
An index structure for improving closest pairs and related join queries in spatial databases
Spatial databases have grown in importance in various fields. Together with them come various types of queries that need to be answered effectively. While queries involving single data set have been studied extensively, join queries on multi-dimensional data like the k-closest pairs and the nearest neighbor joins have only recently received attention. In this paper, we propose a new index struc...
Spatial Queries in the Presence of Obstacles
Despite the existence of obstacles in many database applications, traditional spatial query processing utilizes the Euclidean distance metric assuming that points in space are directly reachable. In this paper, we study spatial queries in the presence of obstacles, where the obstructed distance between two points is defined as the length of the shortest path that connects them without crossing ...
Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest
Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...