Time-Stamping Algorithms for Parallelization of Loops at Run-Time
Abstract
In this paper, we present two new run-time algorithms for the parallelization of loops that have indirect access patterns. The algorithms can handle any type of loop-carried dependencies. They follow the INSPECTOR/EXECUTOR scheme and improve upon previous algorithms with the same generality by allowing concurrent reads of the same location and by increasing the overlap of dependent iterations. The algorithms are based on time-stamping rules and implemented using multithreading tools. The experimental results on an SMP server with four processors show that our schemes are efficient and outperform their competitors consistently in all test cases. The difference between the two proposed algorithms is that one allows partially concurrent reads without causing extra overhead in its inspector, while the other allows fully concurrent reads at a slight overhead in the dependence analysis. The algorithm allowing fully concurrent reads obtains up to an 80% improvement over its competitor.
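The inspector/executor idea with time stamps can be illustrated by a minimal sketch. This is not the paper's algorithm: it is a generic wavefront-style scheme under the assumption that the loop body has the form `a[writes[i]] = f(a[reads[i]])`, and the names `inspect`, `execute`, `reads`, `writes`, and `f` are hypothetical. The inspector gives each iteration a stamp that is later than every conflicting earlier access, while iterations that only read the same location may share a stamp. The executor then runs the iterations level by level.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def inspect(reads, writes):
    """Assign a time stamp to each iteration (hypothetical helper).

    Iteration i reads location reads[i] and writes location writes[i].
    Iterations with equal stamps have no dependence on each other.
    """
    last_write = defaultdict(int)  # location -> stamp of its latest writer
    last_read = defaultdict(int)   # location -> largest stamp of any reader
    stamps = []
    for r, w in zip(reads, writes):
        # Flow dependence: run after the last writer of r.
        # Output dependence: run after the last writer of w.
        # Anti dependence: run after every earlier reader of w.
        stamp = 1 + max(last_write[r], last_write[w], last_read[w])
        stamps.append(stamp)
        # Readers of the same location may share a stamp (concurrent reads).
        last_read[r] = max(last_read[r], stamp)
        last_write[w] = stamp
    return stamps


def execute(a, reads, writes, stamps, f, workers=4):
    """Run the loop level by level; iterations in one level run concurrently."""
    levels = defaultdict(list)
    for i, s in enumerate(stamps):
        levels[s].append(i)

    def body(i):
        a[writes[i]] = f(a[reads[i]])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for s in sorted(levels):
            # list() forces completion of the level before the next one starts.
            list(pool.map(body, levels[s]))
    return a


def run_sequential(a, reads, writes, f):
    """Reference: the original loop executed in program order."""
    for r, w in zip(reads, writes):
        a[w] = f(a[r])
    return a
```

In this sketch the stamps are computed sequentially, so the inspector itself is not parallel; the point is only to show how a stamping rule can let several readers of one location proceed together while writers are ordered after them.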
Similar resources
Effects of Parallelism Degree on Run-Time Parallelization of Loops
Due to the overhead of exploiting and managing parallelism, run-time loop parallelization techniques that aim to maximize parallelism may not necessarily lead to the best performance. In this paper, we present two parallelization techniques that exploit different degrees of parallelism for loops with dynamic cross-iteration dependences. The DOALL approach exploits iteration-level paralleli...
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many important applications. Unfortunately, known techniques for run-time parallelization are often computationally expensive or not general enough. To address this problem, we propose new hardware support for efficient run-time...
Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems
Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge due to the difficulty of scheduling hardware resources with respect to the concurrency of threads. In this paper, to resolve this problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...
The LRPD Test: Speculative Run–Time Parallelization of Loops with Privatization and Reduction Parallelization
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops arise frequently in practice, we advocate a novel framework for their identification: speculatively execute the loop as a doall, and apply a fully parallel data dependence test to determine if it ha...
A Feasibility Study of Hardware Speculative Parallelization in Snoop-Based Multiprocessors
Run-time parallelization is a technique for parallelizing programs with data access patterns difficult to analyze at compile time. In this paper we examine the hardware implementation of a run-time parallelization scheme, called speculative parallelization, on snoop-based multiprocessors. The implementation is based on the idea of embedding dependence checking logic into the cache controller o...