[Figure: thread timeline. Labels: Time; More speculative; Thread 1; Thread 2; Spawned Thread 2; Thread 2 finishes.]
Authors
Abstract
With the current trend toward multicore architectures, improved execution performance can no longer be obtained via traditional single-thread instruction-level parallelism (ILP) but must instead come from multithreaded execution. Generating thread-parallel programs is hard, and thread-level speculation (TLS) has been suggested as an execution model that can speculatively exploit thread-level parallelism (TLP) even when thread independence cannot be guaranteed by the programmer or compiler. Alternatively, the helper threads (HT) execution model has been proposed, in which subordinate threads are executed in parallel with a main thread in order to improve the execution efficiency (i.e., ILP) of the latter. Yet another execution model, runahead execution (RA), has also been proposed, in which subordinate versions of the main thread are dynamically created specifically to cope with long-latency operations, again with the aim of improving the execution efficiency of the main thread. Each of these multithreaded execution models works best for different applications and application phases. In this paper we combine these three models into a single execution model and a single hardware infrastructure, such that the system can dynamically adapt to find the most appropriate multithreaded execution model. More specifically, TLS is favored whenever successful parallel execution of instructions in multiple threads (i.e., TLP) is possible, and the system can seamlessly transition at run time to the other models otherwise. In order to understand the tradeoffs involved, we also develop a performance model that allows one to quantitatively attribute overall performance gains to either TLP or ILP in such a combined multithreaded execution model.
Experimental results show that our unified execution model achieves speedups of up to 41.2%, with an average of 10.2%, over an existing state-of-the-art TLS system and speedups of up to 35.2%, with an average of 18.3%, over a flavor of runahead execution for a subset of the SPEC2000 Int benchmark suite.
∗This work was supported in part by EPSRC under grant EP/G000697/1 and the EC under grant HiPEAC IST-004408.
†The author was supported in part by a Wolfson Microelectronics scholarship.
ICS’09, June 8–12, 2009, Yorktown Heights, New York, USA. Copyright 2009 ACM 978-1-60558-498-0/09/06 ...$5.00.
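The abstract does not spell out the performance model, so the following is only an illustrative sketch of how an overall speedup could be split into an ILP part and a TLP part. It assumes a simple multiplicative decomposition and a hypothetical intermediate measurement, `t_no_overlap` (the execution time with the per-thread efficiency gains but with no overlap between threads). The function name, the parameter names, and the decomposition itself are assumptions made for illustration, not the paper's actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedupBreakdown:
    """Overall speedup and its two assumed multiplicative factors."""
    total: float  # t_seq / t_combined
    ilp: float    # gain from faster per-thread execution (assumed definition)
    tlp: float    # gain from overlapping threads in time (assumed definition)


def attribute_speedup(t_seq: float, t_no_overlap: float, t_combined: float) -> SpeedupBreakdown:
    """Split total speedup into ILP and TLP factors.

    Hypothetical inputs (all in the same time unit):
      t_seq        -- execution time of the sequential baseline
      t_no_overlap -- time with per-thread efficiency gains but no thread overlap
      t_combined   -- time of the combined multithreaded execution
    By construction, ilp * tlp == total.
    """
    if min(t_seq, t_no_overlap, t_combined) <= 0:
        raise ValueError("all execution times must be positive")
    ilp = t_seq / t_no_overlap
    tlp = t_no_overlap / t_combined
    return SpeedupBreakdown(total=t_seq / t_combined, ilp=ilp, tlp=tlp)


if __name__ == "__main__":
    # Made-up numbers, chosen only to exercise the function.
    print(attribute_speedup(t_seq=100.0, t_no_overlap=80.0, t_combined=50.0))
```

Under this assumed decomposition, a factor close to 1.0 indicates that the corresponding source of parallelism contributed little to the overall gain. A speedup reported as a percentage, such as 41.2%, corresponds to a ratio of 1.412 in these terms.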
Similar resources
Investigating Moisture Management Property of a Bi-layer Fabric Through Nanofiber-coated PET as a Novel Sewing Thread: Vertical Wicking Test
Optimization of the thread take-up lever mechanism in lockstitch sewing machine using the imperialistic competitive algorithm
Thread Pitch Variant in Orthodontic Mini-screws: A 3-D Finite Element Analysis
Orthodontic miniscrews are widely used as temporary anchorage devices to facilitate orthodontic movements. Miniscrew loosening is a common problem, which usually occurs during the first two weeks of treatment. Macrodesign can affect the stability of a miniscrew by changing its diameter, length, thread pitch, thread shape, tapering angle and so on. In this study, a 3-D finite element analysis wa...
Fabrication of Bovine Serum Albumin-Loaded Coaxial Electrospun Thread with an Aligned Core-Shell Fibrous Structure
Exploiting Speculative TLP in Recursive Programs by Dynamic Thread Prediction
Speculative parallelisation represents a promising solution to speed up sequential programs that are hard to parallelise otherwise. Prior research has focused mainly on parallelising loops. Recursive procedures, which are also frequently used in real-world applications, have attracted much less attention. Moreover, the parallel threads in prior work are statically predicted and spawned. In this...