A Hardware-Driven Profiling Scheme for Identifying Program Hot Spots to Support Runtime Optimization
Abstract
This paper presents a novel hardware-based approach for identifying, profiling, and monitoring hot spots in order to support runtime optimization of general-purpose programs. The proposed approach consists of a set of tightly coupled hardware tables and control logic modules that are placed in the retirement stage of a processor pipeline removed from the critical path. The features of the proposed design include rapid detection of program hot spots after changes in execution behavior, runtime-tunable selection criteria for hot spot detection, and negligible overhead during application execution. Experiments using several SPEC95 benchmarks, as well as several large WindowsNT applications, demonstrate the promise of the proposed de-
Similar resources
A Hardware-Driven Profiling Scheme for Identifying Program Hot Spots to Support Runtime
This paper presents a novel hardware-based approach for identifying, profiling, and monitoring hot spots in order to support runtime optimization of general-purpose programs. The proposed approach consists of a set of tightly coupled hardware tables and control logic modules that are placed in the retirement stage of a processor pipeline removed from the critical path. The features of the propos...
Temporal and Spatial Program Hot Spot Visualization
Certain parts of computer programs are executed more often than others over the course of normal use. The most frequently visited portions of code are known as “hot spots.” It is usually in these regions where most optimization effort should be focussed. The problem of locating and identifying program hot spots is related to that of detecting program phase changes. Different phases in a program...
Exploiting Hardware Performance Counters with Flow and Context Sensitive Profiling
A program profile attributes run-time costs to portions of a program's execution. Most profiling systems suffer from two major deficiencies: first, they only apportion simple metrics, such as execution frequency or elapsed time to static, syntactic units, such as procedures or statements; second, they aggressively reduce the volume of information collected and reported, although aggregation can hide ...
COBRA: A Framework for Continuous Profiling and Binary Re-Adaptation
Dynamic optimizers have been shown to improve performance and power efficiency of single-threaded applications. Multithreaded applications running on CMP, SMP and cc-NUMA systems also exhibit opportunities for dynamic binary optimization. Existing dynamic optimizers lack efficient monitoring schemes for multiple threads to support appropriate thread specific or system-wide optimization for a collect...
ARS: an adaptive runtime system for locality optimization
Shared memory programs running on Non-Uniform Memory Access (NUMA) machines usually face inherent performance problems stemming from excessive remote memory accesses. A solution, called the Adaptive Runtime System (ARS), is presented in this paper. ARS is designed to adjust the data distribution at runtime through automatic page migrations. It uses memory access histograms gathered by hardware ...