Principles of Speculative Run-Time Parallelization

1 Run-time Optimization Is Necessary
Author
Abstract
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. We advocate a novel framework for the identification of parallel loops. It speculatively executes a loop as a doall and applies a fully parallel data dependence test to check for any unsatisfied data dependences; if the test fails, then the loop is re-executed serially. We will present the principles of the design and implementation of a compiler that employs both run-time and static techniques to parallelize dynamic applications. Run-time optimizations always represent a tradeoff between a speculated potential benefit and a certain (sure) overhead that must be paid. We will introduce techniques that take advantage of classic compiler methods to reduce the cost of run-time optimization, thus tilting the outcome of speculation in favor of significant performance gains. Experimental results from the PERFECT, SPEC and NCSA Benchmark suites show that these techniques yield speedups not obtainable by any other known method.

To achieve a high level of performance for a particular program on today's supercomputers, software developers are often forced to tediously hand-code optimizations tailored to a specific machine. Such hand-coding is difficult, increases the possibility of error over sequential programming, and the resulting code may not be portable to other machines. Restructuring, or parallelizing, compilers address these problems by detecting and exploiting parallelism in sequential programs written in conventional languages. Although compiler techniques for the automatic detection of parallelism have been studied extensively over the last two decades, current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. Typical examples are complex simulations such ...
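The speculate-test-rollback scheme described above can be illustrated with a small sketch. The code below is only an illustration in the spirit of the shadow-array dependence test, not the paper's actual implementation: it records, per array element, whether the element was written and whether it was read before any write in the same iteration (an exposed read), then fails speculation, conservatively, if any element has both marks or was written by more than one iteration. All names (`speculative_doall`, the `loop_body` access-trace interface) are hypothetical.

```python
def speculative_doall(loop_body, n_iters, array):
    """Run the loop as a doall under speculation.

    loop_body(i) yields ("read", idx) / ("write", idx) access pairs for
    iteration i. Returns True if speculation succeeded; on failure the
    array is rolled back and the caller should re-execute serially.
    """
    backup = list(array)          # checkpoint for rollback
    aw = [False] * len(array)     # shadow: element written by some iteration
    ar = [False] * len(array)     # shadow: exposed read (read before any
                                  # write in the same iteration)
    multi_write = False

    for i in range(n_iters):      # conceptually, these iterations run in parallel
        written_here = set()
        for op, idx in loop_body(i):
            if op == "read":
                if idx not in written_here:
                    ar[idx] = True
            else:                                   # "write"
                if aw[idx] and idx not in written_here:
                    multi_write = True              # cross-iteration output dependence
                aw[idx] = True
                written_here.add(idx)
                array[idx] += 1                     # stand-in for the real update

    # Fully parallel post-execution test: fail on any element with both a
    # write and an exposed read (possible flow dependence), or one written
    # by more than one iteration. The check is conservative.
    if multi_write or any(w and r for w, r in zip(aw, ar)):
        array[:] = backup                           # roll back speculative state
        return False
    return True
```

A loop whose iterations write disjoint elements passes the test; one whose iterations read elements written by other iterations fails, and the array is restored so a serial re-execution can proceed.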
Similar Papers
Run-time Parallelization Optimization Techniques

1 Run-time Parallelization Requires Compiler Analysis
In this paper we first present several compiler techniques to reduce the overhead of run-time parallelization. We show how to use static control flow information to reduce the number of memory references that need to be traced at run-time. Then we introduce several methods designed specifically for the parallelization of sparse applications. We detail some heuristics on how to speculate on the type ...
Full Text

A Feasibility Study of Hardware Speculative Parallelization in Snoop-Based Multiprocessors
Run-time parallelization is a technique for parallelizing programs with data access patterns difficult to analyze at compile time. In this paper we examine the hardware implementation of a run-time parallelization scheme, called speculative parallelization, on snoop-based multiprocessors. The implementation is based on the idea of embedding dependence checking logic into the cache controller o...
Full Text

Dynamic and Speculative Polyhedral Parallelization of Loop Nests Using Binary Code Patterns
Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile-time due to the use of dynamic data and control structures. Another motivation of being speculative is to adapt the code to the current execution context, by selecting at run-time an efficient parallel schedule. However, since this parallelization scheme requires on-the-fly ...
Full Text

Techniques for Reducing the Overhead of Run-Time Parallelization
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we have introduced a novel framework for their identification: speculative parallelization. While we have previously shown that this method is inherently scalable, its p...
Full Text

Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many important applications. Unfortunately, known techniques for run-time parallelization are often computationally expensive or not general enough. To address this problem, we propose new hardware support for efficient run-time...
Full Text