An Inspector-Executor Algorithm for Irregular Assignment Parallelization
Authors
Abstract
A loop with irregular assignment computations contains loop-carried output data dependences that can only be detected at run-time. In this paper, a load-balanced method based on the inspector-executor model is proposed to parallelize this loop pattern. The basic idea lies in splitting the iteration space of the sequential loop into sets of conflict-free iterations that can be executed concurrently on different processors. As will be demonstrated, this method outperforms existing techniques. Irregular access patterns with different load-balancing and reusability properties are considered in the experiments.
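To make the idea of conflict-free iteration sets concrete, here is a minimal sketch in Python. It is not the paper's load-balanced algorithm. It assumes the loop has the form `a[f[i]] = rhs(i)`, where the subscript array `f` is known only at run-time and `rhs(i)` does not read `a`. The names `inspect`, `execute`, `f`, `rhs`, and `workers` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def inspect(f):
    """Inspector: split iterations 0..len(f)-1 into conflict-free sets.

    Iteration i is placed in set k when k earlier iterations write the
    same element f[i], so no two iterations in one set write the same
    element. (Illustrative grouping rule, not the paper's method.)
    """
    writes_seen = defaultdict(int)  # element index -> writes seen so far
    sets = []
    for i, target in enumerate(f):
        k = writes_seen[target]
        if k == len(sets):
            sets.append([])
        sets[k].append(i)
        writes_seen[target] += 1
    return sets


def execute(a, f, rhs, sets, workers=4):
    """Executor: run the sets in order, iterations of one set concurrently.

    Running the sets in order means the last sequential write to each
    element is also the last write here, so the result matches the
    sequential loop.
    """
    def body(i):
        a[f[i]] = rhs(i)  # assumption: rhs(i) never reads a

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for iteration_set in sets:
            # list() forces the whole set to finish before the next starts
            list(pool.map(body, iteration_set))
    return a
```

For example, with `f = [2, 0, 2, 1, 0, 2]` the inspector returns `[[0, 1, 3], [2, 4], [5]]`, and `execute([0] * 3, f, lambda i: 10 * i, inspect(f))` yields `[40, 30, 50]`, the same result as the sequential loop. The sets in this sketch can differ greatly in size, which is the kind of imbalance a load-balanced method would have to address.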
Similar resources
Run-Time Parallelization of Irregular DOACROSS Loops
Dependencies between iterations of loop structures cannot always be determined at compile-time because they may depend on input data which is known only at run-time. A prime example is a loop accessing an array where the array indices are themselves functions of another array determined only at run-time. To parallelize such loops, it is necessary to perform a run-time analysis. We describe a ne...
Time Stamp Algorithms for Runtime Parallelization of DOACROSS Loops with Dynamic Dependences
This paper presents a time stamp algorithm for runtime parallelization of general DOACROSS loops that have indirect access patterns. The algorithm follows the INSPECTOR/EXECUTOR scheme and exploits parallelism at a fine-grained memory reference level. It features a parallel inspector and improves upon previous algorithms of the same generality by exploiting parallelism among consecutive reads ...
Automatic Parallelizing Compiler for Distributed Memory Parallel Computers: New Algorithms to Improve the Performance of the Inspector/executor
Parallelization Techniques for Sparse Matrix Applications
Sparse matrix problems are difficult to parallelize efficiently on distributed memory machines since data is often accessed indirectly. Inspector/executor strategies, which are typically used to parallelize loops with indirect references, incur substantial run-time preprocessing overheads when references with multiple levels of indirection are encountered, a frequent occurrence in sparse matrix al...
Extending the Applicability and Improving the Performance of Runtime Parallelization
When static analysis of a sequential loop fails to yield reliable information on its dependence structure, a parallelizing compiler is left with three alternatives: it can take the conservative option of emitting code for a sequential execution; it can optimistically emit code to speculatively execute the loop as a DOALL [6, 7]; or it can emit inspector-executor code to determine the actual dep...
Publication date: 2004