Efficient Dynamic Programming Algorithms for Ordering Expensive Joins and Selections
Abstract
The generally accepted optimization heuristic of pushing selections down does not yield optimal plans in the presence of expensive predicates. Therefore, several researchers have proposed algorithms to compute optimal processing trees for queries with expensive predicates. All these approaches are incorrect, with one exception [3]. Our contribution is as follows. We present a formally derived and correct dynamic programming algorithm to compute optimal bushy processing trees for queries with expensive predicates. This algorithm is then enhanced to (1) handle several join algorithms, including sort merge with a correct handling of interesting sort orders, (2) perform predicate splitting, and (3) exploit structural information about the query graph to cut down the search space. Further, we present efficient implementations of the algorithms. More specifically, we introduce unique solutions for efficiently computing the cost of the intermediate plans and for saving memory space by utilizing bitvector contraction. Our implementations impose no restrictions on the type of query graphs, the shape of processing trees, or the class of cost functions. We establish the correctness of our algorithms and derive tight asymptotic bounds on the worst-case time and space complexities. We also report on a series of benchmarks showing that queries of sizes which are likely to occur in practice can be optimized over the unconstrained search space in less than a second.
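To make the general idea of dynamic programming over bushy processing trees concrete, here is a minimal sketch. It is not the algorithm from the paper: it ignores expensive selections, join algorithm choice, sort orders, predicate splitting, and bitvector contraction. The function name `best_bushy_plan`, the input format, and the cost function (sum of estimated intermediate result sizes, with cross products allowed at selectivity 1.0) are all assumptions made for illustration. Sets of relations are encoded as bitmasks, and every subset is split into two non-empty halves whose best plans are already known.

```python
def best_bushy_plan(cards, sel):
    """Illustrative DP over relation subsets (hypothetical helper, not the paper's code).

    cards: list of base-relation cardinalities.
    sel:   dict mapping (i, j) with i < j to a join selectivity; missing pairs
           are treated as cross products (selectivity 1.0).
    Returns (cost, plan), where plan is a nested tuple of relation indices.
    """
    n = len(cards)
    full = (1 << n) - 1
    size = [0.0] * (full + 1)           # estimated cardinality per subset
    cost = [float("inf")] * (full + 1)  # best known cost per subset
    plan = [None] * (full + 1)          # best known tree per subset

    for i in range(n):
        size[1 << i] = float(cards[i])
        cost[1 << i] = 0.0              # assumed: scanning a base relation is free
        plan[1 << i] = i

    # Subsets of s are numerically smaller than s, so ascending order suffices.
    for s in range(1, full + 1):
        if s & (s - 1) == 0:
            continue                    # singleton, already initialised
        # Cardinality of s: add the lowest relation to the rest of the set.
        low = s & -s
        rest = s ^ low
        i = low.bit_length() - 1
        card = size[rest] * cards[i]
        for j in range(n):
            if (rest >> j) & 1:
                card *= sel.get((min(i, j), max(i, j)), 1.0)
        size[s] = card
        # Try every split of s into two non-empty disjoint parts (bushy trees).
        left = (s - 1) & s
        while left:
            right = s ^ left
            if left < right:            # skip mirrored duplicates
                c = cost[left] + cost[right] + size[s]
                if c < cost[s]:
                    cost[s] = c
                    plan[s] = (plan[left], plan[right])
            left = (left - 1) & s
    return cost[full], plan[full]


if __name__ == "__main__":
    # Three relations in a chain R0 - R1 - R2 with made-up statistics.
    print(best_bushy_plan([10, 100, 1000], {(0, 1): 0.1, (1, 2): 0.01}))
```

Enumerating all splits of all subsets takes on the order of 3^n steps for n relations, which is why a sketch like this is only practical for small queries; the refinements listed in the abstract are aimed at exactly this search space.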
Similar resources
Optimal Ordering of Selections and Joins in Acyclic Queries with Expensive Predicates
The generally accepted optimization heuristic of pushing selections down does not yield optimal plans in the presence of expensive predicates. Therefore, several researchers have proposed algorithms for the optimal ordering of expensive joins and selections in a query evaluation plan. All of these algorithms have an exponential run time. For a special case, we propose a polynomial algorithm wh...
Term paper: Randomized Algorithms and Heuristics for Join Ordering
In the relational database setting today, large queries containing many joins are becoming increasingly common. In general the ordering of join-operations is quite sensitive and can have a devastatingly negative effect on the efficiency of the DBMS. Scheufele and Moerkotte proved that join-ordering is NP-complete in the general case [4]. For smaller queries however, less than approximately 10 j...
Bypassing Joins in Disjunctive Queries
The bypass technique, which was formerly restricted to selections only [KMPS94], is extended to join operations. Analogous to the selection case, the join operator may generate two output streams, the join result and its complement, whose subsequent operator sequence is optimized individually. By extending the bypass technique to joins, several problems have to be solved. (1) An algorithm for exha...
On the Optimal Ordering of Maps, Selections, and Joins Under Factorization
We examine the problem of producing the optimal evaluation order for queries containing joins, selections, and maps. Specifically, we look at the case where common subexpressions involving expensive UDF calls can be factored out. First, we show that ignoring factorization during optimization can lead to plans that are far off the best possible plan: the difference in cost between the best plan ...
A Mathematical Optimization Model for Solving Minimum Ordering Problem with Constraint Analysis and some Generalizations
In this paper, a mathematical method is proposed to formulate a generalized ordering problem. This model is formed as a linear optimization model in which some variables are binary. The constraints of the problem have been analyzed with the emphasis on the assessment of their importance in the formulation. On the one hand, these constraints enforce conditions on an arbitrary subgraph and then g...