Hardware/Software Techniques for Assisted Execution Runtime Systems
Abstract
The increasing complexity of modern and future multi-core/multithreaded processors raises the question of how to best utilize processor resources. On one side, Amdahl’s Law limits the maximum theoretical speedup of parallel applications while, on the other side, the increasing complexity of programming language runtimes may introduce implicit serialization points. Several studies have demonstrated that it is often more convenient to use some of the hardware threads to assist execution than to run supplementary application threads. Assisted execution approaches, however, may lead to low processor utilization. In this paper, we explore fine-grained hardware resource allocation techniques that assign hardware resources to application and auxiliary threads at runtime, according to their actual computing power demand. As a test case, we apply fine-grained resource allocation to STM2, the first parallel STM system that offloads STM management operations to auxiliary threads. We implemented an integrated hardware/software solution in which each level performs well-defined tasks efficiently: 1) STM2 is enriched with a runtime mechanism to express the computing power requirements of application and auxiliary threads; 2) the hardware enforces dynamic resource partitioning among running threads; 3) the operating system provides a simple and efficient interface between STM2 and the hardware resource allocation mechanism. Specifically, we leverage the IBM POWER7 hardware thread prioritization mechanism to bias the allocation of hardware resources in favor of compute-intensive application threads or overloaded auxiliary threads. We test fine-grained resource allocation solutions on a real IBM POWER7 system running a simple and malleable TM benchmark (Eigenbench) and applications from the STAMP benchmark suite.
Results show that the proposed integrated solution improves performance over the standard STM2 design by up to 65% for Eigenbench and up to 11% for STAMP applications.
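The division of labor described in the abstract (the runtime reports demand, a policy biases hardware priority toward the bottleneck thread) can be illustrated with a minimal sketch. This is not the STM2 implementation: the function name `choose_priorities`, the queue-occupancy metric, the thresholds, and the three priority levels are all hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Illustrative ordered priority levels (names are assumptions)."""
    LOW = 1
    NORMAL = 2
    HIGH = 3


def choose_priorities(queue_fill: float,
                      high_mark: float = 0.75,
                      low_mark: float = 0.25):
    """Return (application_priority, auxiliary_priority).

    queue_fill is a hypothetical metric in [0, 1]: how full the queue of
    pending STM management work for the auxiliary thread is.
    """
    if not 0.0 <= queue_fill <= 1.0:
        raise ValueError("queue_fill must be in [0, 1]")
    if queue_fill >= high_mark:
        # Auxiliary thread is overloaded: bias resources toward it.
        return Priority.LOW, Priority.HIGH
    if queue_fill <= low_mark:
        # Auxiliary thread is mostly idle: favor the application thread.
        return Priority.HIGH, Priority.LOW
    # Balanced demand: leave both threads at the default level.
    return Priority.NORMAL, Priority.NORMAL
```

On a real system the chosen levels would be handed to the hardware through the operating system interface; that step is platform-specific and is left out of this sketch.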
Similar resources
Runtime Environment for Dynamically Reconfigurable Embedded Sy
A runtime environment has been developed to enable the seamless integration of different hardware and software implementation technologies (DSPs, FPGAs, ASICs). The runtime environment is responsible for the management of dynamic system reconfiguration, including software reconfiguration for the parallel DSPs and hardware reconfiguration for the FPGAs in the system. This paper describes th...
Software Techniques for Avoiding Hardware Virtualization Exits
On modern processors, hardware-assisted virtualization outperforms binary translation for most workloads. But hardware virtualization has a potential problem: virtualization exits are expensive. While hardware virtualization executes guest instructions at native speed, guest/VMM transitions can sap performance. Hardware designers attacked this problem both by reducing guest/VMM transition costs...
A Hardware-assisted Instruction Security Monitoring Design in Embedded System
This paper presents a series of novel architecturally enhanced security solutions. In the cross-compilation link stage, the automated compiler extracts the intrusion model for instruction code and static data, while secure tags for each main memory segment are added automatically at compile time. At runtime, the designed hardware observes its dynamic execution trace and checks whether the t...
Verification and analysis of domain-specific models of physical characteristics in embedded control software
Context: A considerable portion of today's software systems is deployed in the embedded control domain. Embedded control software deals with controlling a physical system, and as such models of physical characteristics become part of the embedded control software. Objective: Due to the evolution of system properties and increasing complexity, faults can be left undetected in these models of p...
Dynamic Identification of Shared Transactional Locations
Hardware TM systems execute user code within an atomic{} delimiter without any instrumentation. Software transactional memory systems require complex sequences of operations to be executed on the memory locations shared by transactions, but typically not on unshared locations, even if these are accessed within the scope of a transaction. Lack of identification of such instructions introduces a ...