Double Inspection for Run-Time Loop Parallelization
Authors
Abstract
The Inspector/Executor approach is a well-known technique for parallelizing loops with irregular access patterns that cannot be analyzed statically. Existing inspectors have three downsides: their high run-time overhead is hard to amortize by actually executing the loop in parallel, they can only be applied to loops whose dependencies do not change during execution, and they are often designed specifically for array codes and are in general not applicable in object-oriented just-in-time compilation. In this paper we present an inspector that inspects a loop twice to detect whether it is fully parallelizable. It works for arbitrary memory access patterns and is conservative, as it notices if changing data dependencies would cause errors in a potential parallel execution. Most importantly, because it is designed for current multicore architectures, it is fast despite its double inspection effort: it pays off at its first use. On benchmarks we can amortize the inspection overhead and outperform the sequential version from 2 or 3 cores onward.
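To make the general Inspector/Executor idea concrete, here is a minimal Python sketch. It is a generic single-pass inspector, not the double-inspection scheme of the paper, whose details the abstract does not give. The names `inspect`, `run_loop`, and `scatter`, and the use of `(array_name, index)` tuples as addresses, are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def inspect(n, reads_of, writes_of):
    """Return True if no two distinct iterations touch the same address
    with at least one of them writing (i.e. the loop looks fully parallel)."""
    writer = {}   # address -> the single iteration that writes it
    readers = {}  # address -> set of iterations that read it
    for i in range(n):
        for addr in writes_of(i):
            if writer.get(addr, i) != i:
                return False  # two different iterations write the same address
            writer[addr] = i
        for addr in reads_of(i):
            readers.setdefault(addr, set()).add(i)
    for addr, w in writer.items():
        if readers.get(addr, set()) - {w}:
            return False  # another iteration reads what iteration w writes
    return True


def run_loop(n, body, reads_of, writes_of, workers=4):
    """Executor: run iterations in parallel only if the inspector approves."""
    if inspect(n, reads_of, writes_of):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(body, range(n)))  # drain the iterator to wait for all
        return "parallel"
    for i in range(n):
        body(i)
    return "sequential"


def scatter(idx, src):
    """Hypothetical irregular loop: dst[idx[i]] = 2 * src[i]."""
    dst = [0] * len(src)

    def body(i):
        dst[idx[i]] = src[i] * 2

    mode = run_loop(
        len(src),
        body,
        reads_of=lambda i: [("src", i), ("idx", i)],
        writes_of=lambda i: [("dst", idx[i])],
    )
    return mode, dst
```

With a permutation such as `idx = [2, 0, 1]` every iteration writes a distinct element, so the loop is run in parallel; with `idx = [0, 0, 1]` two iterations write `dst[0]` and the sketch falls back to sequential execution. Python threads are used here only to show the control flow; they would not yield the multicore speedups the abstract reports.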
Similar resources
Quantifier Elimination in Automatic Loop Parallelization
We present an application of quantifier elimination techniques in the automatic parallelization of nested loop programs. The technical goal is to simplify affine inequalities whose coefficients may be unevaluated symbolic constants. The values of these so-called structure parameters are determined at run time and reflect the problem size. Our purpose here is to make the research community of qu...
Reordering Iterations in Runtime Loop Parallelization
When a loop in a sequential program is parallelized, it is normally guaranteed that all flow dependencies and anti-dependencies are respected so that the result of parallel execution is always the same as sequential execution. In some cases, however, the algorithm implemented by the loop allows the iterations to be executed in a different sequential order than the one specified in the program. This...
The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we advocate a novel framework for their identification: speculatively execute the loop as a doall, and apply a fully parallel data dependence test to determine if it ha...
Affine Parallelization of Loops with Run-Time Dependent Bounds from Binaries
An automatic parallelizer is a tool that converts serial code to parallel code. This is an important tool because most hardware today is parallel and manually rewriting the vast repository of serial code is tedious and error prone. We build an automatic parallelizer for binary code, i.e. a tool which converts a serial binary to a parallel binary. It is important because: (i) most serial legacy ...
Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems
Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because it is difficult to schedule hardware resources with respect to thread concurrency. In this paper, to resolve this problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...