Optimizing Sorting and Duplicate Elimination in XQuery Path Expressions
Abstract
XQuery expressions can manipulate two kinds of order: document order and sequence order. While the user can impose or observe the order of items within a sequence, the results of path expressions must always be returned in document order. Correctness can be obtained by inserting explicit (and expensive) operations to sort and remove duplicates after each XPath step. However, many such operations are redundant. In this paper, we present a systematic approach to remove unnecessary sorting and duplicate elimination operations in path expressions in XQuery 1.0. The technique uses an automaton-based algorithm which we have applied successfully to path expressions within a complete XQuery implementation. Experimental results show that the algorithm detects and eliminates most redundant sorting and duplicate elimination operators and is very effective on common XQuery path expressions.
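To make the cost described in the abstract concrete, the following is a minimal Python sketch of why a naive step-by-step evaluation of a path expression can produce duplicates or out-of-order nodes, and of one case where the sort and duplicate-elimination pass is redundant. It is not the automaton-based algorithm of the paper. The names `Node`, `step`, `ddo`, and the toy document are assumptions introduced only for illustration, and document order is modeled as a preorder rank.

```python
class Node:
    """A toy XML element; `pre` is its preorder rank, used as document order."""
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)
        self.pre = -1

def number(root):
    """Assign preorder ranks (document order) to every node under root."""
    counter, stack = 0, [root]
    while stack:
        node = stack.pop()
        node.pre = counter
        counter += 1
        stack.extend(reversed(node.children))  # first child is visited first

def child(node, name):
    return [c for c in node.children if c.name == name]

def descendant(node, name):
    out = []
    for c in node.children:
        if c.name == name:
            out.append(c)
        out.extend(descendant(c, name))
    return out

def step(seq, axis, name):
    """Naive step: concatenate the axis result of each context node in turn."""
    out = []
    for node in seq:
        out.extend(axis(node, name))
    return out

def ddo(seq):
    """Sort by document order and remove duplicates (the expensive operation)."""
    by_rank = {node.pre: node for node in seq}
    return [by_rank[rank] for rank in sorted(by_rank)]

def ranks(seq):
    return [node.pre for node in seq]

# Hypothetical document: r(0) [ b(1) [ b(2) [ c(3) ], c(4) ], b(5) [ c(6) ] ]
root = Node("r", Node("b", Node("b", Node("c")), Node("c")), Node("b", Node("c")))
number(root)

all_b = step([root], descendant, "b")        # ranks [1, 2, 5]

# descendant::b/descendant::c: nested b elements give a duplicate and break order.
naive_desc = step(all_b, descendant, "c")    # ranks [3, 4, 3, 6]

# descendant::b/child::c: no duplicates, but the result is out of order.
naive_child = step(all_b, child, "c")        # ranks [4, 3, 6]

# child::b/child::c from the root: already sorted and duplicate-free,
# so applying ddo here would be redundant work.
child_only = step(step([root], child, "b"), child, "c")  # ranks [4, 6]

print(ranks(naive_desc), ranks(ddo(naive_desc)))
print(ranks(naive_child), ranks(ddo(naive_child)))
print(ranks(child_only), ranks(ddo(child_only)))
```

In this sketch, only the first two paths need the `ddo` pass. An optimizer of the kind the abstract describes would aim to recognize statically that paths such as the third one never need it.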
Similar resources
Avoiding Unnecessary Ordering Operations in XPath
We present a sound and complete rule set for determining whether sorting by document order and duplicate removal operations in the query plan of XPath expressions are unnecessary. Additionally we define a deterministic automaton that illustrates how these rules can be translated into an efficient algorithm. This work is an important first step in the understanding and tackling of XPath/XQuery o...
Structure and Value Synopses for XML Data Graphs
All existing proposals for querying XML (e.g., XQuery) rely on a pattern-specification language that allows (1) path navigation and branching through the label structure of the XML data graph, and (2) predicates on the values of specific path/branch nodes, in order to reach the desired data elements. Optimizing such queries depends crucially on the existence of concise synopsis structures that ...
XQuery Translation to Sem-SQL
XML query translation is an inevitable step involved in using non-XML databases storing XML data. In this paper, we address the XQuery to Sem-SQL translation issue, part of the XML storage and retrieval using the Semantic Binary Object-Oriented Database System (Sem-ODB) project, by providing a high-level description of the translation scheme between XQuery and Sem-SQL. Our translation scheme ut...
Efficient XQuery Evaluation of Grouping Conditions with Duplicate Removals
Currently, grouping in XQuery must be expressed implicitly with nested FLWOR expressions. With XQuery 1.1, an explicit group by clause will be part of this query language. As users integrate this new construct into their applications, it becomes important to have efficient evaluation techniques available to process even complex grouping conditions. Among them, the removal of distinct values or ...
Sorting, Grouping and Duplicate Elimination in the Advanced Information Management Prototype
Sorting, duplicate suppression and grouping are important operations in relational database management systems. This paper is devoted to the related language features and their implementation in the Advanced Information Management Prototype AIM-P. The query language HDBL is an SQL-like database language supporting the extended NF* data model. The proposed language extensions follow the classica...