Dualheap Selection Algorithm: Efficient, Inherently Parallel and Somewhat Mysterious
Abstract
An inherently parallel algorithm is proposed that efficiently performs selection: finding the K-th largest member of a set of N members. Selection is a common component of many more complex algorithms and therefore is a widely studied problem.

Not much is new in the proposed dualheap selection algorithm: the heap data structure is from J.W.J. Williams [1], the bottom-up heap construction is from R.W. Floyd [2], and the concept of a two heap data structure is from J.W.J. Williams and D.E. Knuth [3]. The algorithm's novelty is limited to relatively minor implementation twists:

• the two heaps are oriented with their roots at the partition values rather than at the minimum and maximum values,
• the coding of one of the heaps (the heap of smaller values) employs negative indexing,
• the exchange phase of the algorithm is similar to a bottom-up heap construction, but navigates the heap with a post-order tree traversal.

When run on a single processor, the dualheap selection algorithm's performance is competitive with quickselect with median estimation, a common variant of C.A.R. Hoare's quicksort algorithm [4]. When run on parallel processors, the dualheap selection algorithm is superior due to its subtasks that are easily partitioned and innately balanced.

1. ALGORITHM OVERVIEW

A heap is an array with elements regarded as nodes in a complete binary tree, where node j is the parent of nodes 2j and 2j+1, and where the value at each parent node is superior to the values at its children's nodes. This superiority of all the parent nodes is commonly called the heap condition.

The dualheap selection algorithm consists of three phases that are roughly equivalent in terms of the number of comparisons and moves they perform:

1) the whole heap construction phase,
2) the split heap construction phase,
3) the exchange phase.

Phase 1 is a bottom-up heap construction as described in many algorithm textbooks such as "Algorithms" by Sedgewick [5]. Although not strictly required, this initial bottom-up heap construction typically reduces the total number of comparison and move operations by about 10%.

Phase 2 consists of two more bottom-up heap constructions that split the original set of N members into a size K heap and a size N-K heap. It is worth noting that the partition point in the dualheap selection algorithm is set just once, based upon the selected values of N and K rather than the values being partitioned.

Phase 3 repeatedly exchanges subtrees between the two heaps until all the values in the size K heap of larger values are not smaller than any of the values in the size N-K heap of smaller values. Symbolically, the downward pointing triangle in Figure 1-1 represents the size K heap of larger values, the upward pointing triangle represents the size N-K heap of smaller values, and the arrow shows the direction of increasing values and increasing heap indices, indicating that the size N-K heap of smaller values employs negative indices. Phase 3 keeps exchanging values between the two heaps until the heap value ranges no longer overlap. At that point, the K-th largest member of the original set of N is at the root of the size K heap of larger values.

[Figure 1-1: the two heaps, pre phase 3 and post phase 3]
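The two-heap idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it skips the optional phase 1, stores both heaps with ordinary positive offsets instead of negative indexing, and replaces the post-order subtree exchange of phase 3 with a simpler root-swap loop. The names `sift_down`, `build_heap`, and `select_kth_largest` are hypothetical and chosen only for this illustration.

```python
import operator


def sift_down(a, base, size, j, superior):
    """Restore the heap condition below node j of the heap in a[base:base+size].

    Nodes are numbered from 1, so node j lives at a[base + j - 1] and its
    children are nodes 2j and 2j + 1.
    """
    while 2 * j <= size:
        c = 2 * j
        # Pick the superior of the two children, if the right child exists.
        if c < size and superior(a[base + c], a[base + c - 1]):
            c += 1
        if not superior(a[base + c - 1], a[base + j - 1]):
            break
        a[base + c - 1], a[base + j - 1] = a[base + j - 1], a[base + c - 1]
        j = c


def build_heap(a, base, size, superior):
    """Bottom-up heap construction over a[base:base+size]."""
    for j in range(size // 2, 0, -1):
        sift_down(a, base, size, j, superior)


def select_kth_largest(values, k):
    """Return the k-th largest value (k = 1 gives the maximum)."""
    a = list(values)
    n = len(a)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(values)")

    # Split heap construction: the partition point depends only on n and k.
    # a[0:k] holds the larger values with the smallest of them at its root;
    # a[k:n] holds the smaller values with the largest of them at its root.
    build_heap(a, 0, k, operator.lt)
    build_heap(a, k, n - k, operator.gt)

    # Simplified exchange phase (an assumption, not the paper's post-order
    # traversal): swap the two roots while the value ranges still overlap.
    while k < n and a[k] > a[0]:
        a[0], a[k] = a[k], a[0]
        sift_down(a, 0, k, 1, operator.lt)
        sift_down(a, k, n - k, 1, operator.gt)

    return a[0]


print(select_kth_largest([5, 1, 9, 3, 7], 2))  # 7
```

Each root swap strictly increases the sum of the values in the heap of larger values, so the loop terminates. When it stops, no value in the heap of smaller values exceeds the root of the heap of larger values, which is therefore the K-th largest member.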
Journal: CoRR
Volume: abs/0706.2155
Publication date: 2007