Run-time Parallelization Optimization Techniques*
Author
Abstract
In this paper we first present several compiler techniques to reduce the overhead of run-time parallelization. We show how to use static control flow information to reduce the number of memory references that need to be traced at run-time. Then we introduce several methods designed specifically for the parallelization of sparse applications. We detail some heuristics on how to speculate on the type and data structures used by the original code and thus reduce the memory requirements for tracing the sparse access patterns without performing any additional work. Optimization techniques for sparse reduction parallelization and speculative loop distribution conclude the paper.

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. To fill this gap we advocate a novel framework for their identification: speculatively execute the loop as a doall, and apply a fully parallel data dependence test to determine whether it had any cross-processor dependences; if the test fails, then the loop is re-executed serially. While this method is inherently scalable, its practical success depends on the fraction of ideal speedup that can be obtained on modest to moderately large parallel machines. The resulting parallelism can be maximized only by minimizing the run-time overhead of the method, which in turn depends on its level of integration within a restructuring compiler. This technique (the LRPD test) and related issues have been presented in detail in [3, 4] and thus will not be presented here.

We describe a compiler technique that reduces the number of memory references that have to be collected at run-time by using static control flow information. With this technique we can remove the shadowing of many references.

* A full version of this paper is available as
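To make the speculate-as-doall framework concrete, the following is a minimal C++/OpenMP sketch, not the paper's implementation: the loop body A[W[i]] = 2.0 * A[R[i]], the Marks structure, and the function speculative_doall are all assumed names for illustration. Each processor traces its reads and writes into a private shadow array; a fully parallel test then flags any element that was written on one processor and touched on another.

```cpp
#include <omp.h>
#include <vector>

// Per-element shadow marks recorded during the speculative run.
struct Marks { bool read = false; bool write = false; };

// Speculatively execute A[W[i]] = 2.0 * A[R[i]] as a doall, tracing every
// reference into per-processor shadow arrays, then run a fully parallel
// test for cross-processor dependences. Returns false if the speculation
// failed and the loop had to be re-executed serially.
bool speculative_doall(std::vector<double>& A,
                       const std::vector<int>& R,   // read subscripts
                       const std::vector<int>& W) { // write subscripts
    const int n = (int)R.size();
    const int P = omp_get_max_threads();
    std::vector<std::vector<Marks>> shadow(P, std::vector<Marks>(A.size()));
    std::vector<double> backup(A);        // checkpoint for rollback

    // schedule(static): each processor runs a contiguous chunk in program
    // order, so only cross-processor dependences can invalidate the doall.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        Marks* my = shadow[omp_get_thread_num()].data();
        my[R[i]].read  = true;            // trace ("shadow") the read
        my[W[i]].write = true;            // trace the write
        A[W[i]] = 2.0 * A[R[i]];          // the speculated loop body
    }

    // Fully parallel dependence test: element j is a violation if some
    // processor wrote it and a different processor also touched it.
    bool ok = true;
    #pragma omp parallel for reduction(&&:ok)
    for (long j = 0; j < (long)A.size(); ++j) {
        int writers = 0, touchers = 0;
        for (int p = 0; p < P; ++p) {
            if (shadow[p][j].write) ++writers;
            if (shadow[p][j].read || shadow[p][j].write) ++touchers;
        }
        ok = ok && !(writers > 0 && touchers > 1);
    }

    if (!ok) {          // test failed: roll back and re-execute serially
        A = backup;
        for (int i = 0; i < n; ++i)
            A[W[i]] = 2.0 * A[R[i]];
    }
    return ok;
}
```

If iterations do conflict, the shared array may hold an arbitrary interleaving of writes by the time the test fails, which is why the checkpoint is taken before the speculative run; a production scheme shadows and checkpoints far more selectively, and that tracing cost is precisely the overhead the paper's techniques attack.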
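The control-flow technique can be illustrated on a single traced loop body. In the hypothetical iteration below (idx, B, C, and write_mark are assumed names, with write_mark standing in for the per-processor shadow array of the previous sketch), the write to A[k] dominates the read of A[k] on every path through the iteration, so marking the write alone preserves the outcome of the dependence test and the read's shadow operation can be removed.

```cpp
#include <vector>

// Hypothetical traced doall body after the control-flow optimization:
// both references use the same subscript k, and the write dominates the
// read, so the compiler emits one shadow operation instead of two.
void traced_body(std::vector<double>& A, std::vector<double>& C,
                 const std::vector<double>& B, const std::vector<int>& idx,
                 std::vector<bool>& write_mark, int i) {
    int k = idx[i];
    write_mark[k] = true;   // single mark covers both references below
    A[k] = B[i];            // this write dominates ...
    if (B[i] > 0.0)
        C[i] = A[k];        // ... this read of the same address: any
                            // cross-processor conflict on A[k] is already
                            // exposed by the write mark, so the read is
                            // left untraced
}
```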
Similar Papers
Run-time Parallelization Techniques for Sparse Applications
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we have introduced a novel framework for their identification: speculative parallelization. While we have previously shown that this method is inherently scalable, its p...
Techniques for Reducing the Overhead of Run-Time Parallelization
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we have introduced a novel framework for their identification: speculative parallelization. While we have previously shown that this method is inherently scalable, its p...
Hybrid Dependence Analysis for Automatic Parallelization
Automatic program parallelization has been an elusive goal for many years. It has recently become more important due to the widespread introduction of multi-cores in PCs. Automatic parallelization could not be achieved because classic compiler analysis was not powerful enough and program behavior was found to be in many cases input dependent. Run-time thread level parallelization, introduce...
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many important applications. Unfortunately, known techniques for run-time parallelization are often computationally expensive or not general enough. To address this problem, we propose new hardware support for efficient run-time...
Dynamic and Speculative Polyhedral Parallelization of Loop Nests Using Binary Code Patterns
Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile-time due to the use of dynamic data and control structures. Another motivation for being speculative is to adapt the code to the current execution context, by selecting at run-time an efficient parallel schedule. However, since this parallelization scheme requires on-the-fly ...