Is There Exploitable Thread-Level Parallelism in General-Purpose Application Programs?
Author
Abstract
Most of the thread-level parallelism (TLP) successfully exploited so far has come from scientific application programs, in particular floating-point programs. General-purpose applications, especially those written in C or C++, such as the benchmarks in SPECint2000, have primarily exploited only instruction-level parallelism (ILP). A lot of research has been done recently on multiprocessors-on-a-chip (often called “multithreaded processors”) because VLSI technology today allows multiple processor cores to be implemented on a single chip. An interesting question has arisen as to how much TLP and ILP could be exploited in general-purpose application programs so that such multithreaded processors could become the main workhorses of future computer systems. In this talk, we will discuss the program characteristics that make it so difficult to exploit TLP in these general-purpose application programs, and will present several machine models and simulation techniques for studying how much TLP and ILP could be exploited with these models. We will also present some measurements of program characteristics important to the design of such multithreaded processors and their compilers.

0-7695-1926-1/03/$17.00 (C) 2003 IEEE. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’03)
Similar resources
Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor
Thread-level speculation (TLS) makes it possible to parallelize general-purpose C programs. This paper proposes software and hardware mechanisms that support speculative thread-level execution on a single-chip multiprocessor. A detailed analysis of programs using the TLS execution model shows a bound on the performance of a TLS machine that is promising. In particular, TLS makes it feasible to ...
The Superthreaded Processor Architecture
The common single-threaded execution model limits processors to exploiting only the relatively small amount of instruction-level parallelism available in application programs. The superthreaded processor, on the other hand, is a concurrent multithreaded architecture (CMA) that can exploit the multiple granularities of parallelism available in general-purpose application programs. Unlike other C...
Integrating Parallelizing Compilation Technology and Processor Architecture for Cost-Effective Concurrent Multithreading
As the number of transistors on a single chip continues to grow, it is important to think beyond the traditional approaches of compiler optimizations for deeper pipelines and wider instruction issue units to improve performance. This single-threaded execution model limits these approaches to exploiting only the relatively small amount of instruction-level parallelism available in application pr...
Detecting the Existence of Coarse-Grain Parallelism in General-Purpose Programs
With the rise of chip multiprocessors, the problem of parallelizing general-purpose programs has once again been placed on the research agenda. In the 1980s and early 1990s, great success was achieved in extracting parallelism from the inner loops of scientific computations. General-purpose programs, however, stayed out of reach due to the complexity of their control flow and data dependences. ...
Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU Architectures
Recently, general-purpose GPU (GPGPU) programming has spread rapidly after CUDA was first introduced to write parallel programs in high-level languages for NVIDIA GPUs. While a GPU exploits data parallelism very effectively, task-level parallelism is exploited as a multi-threaded program on a multicore CPU. For such a heterogeneous platform that consists of a multicore CPU and GPU, in this pape...