Fast and Efficient Partial Code Reordering: Taking Advantage of Dynamic Recompilation
نویسندگان
چکیده
Poor instruction cache locality can degrade performance on modern architectures. For example, our simulation results show that eliminating all instruction cache misses improves performance by as much as 16% for a modestly sized instruction cache. In this paper, we show how to take advantage of dynamic code generation in a Java Virtual Machine (VM) to improve instruction locality at run-time. We develop a dynamic code reordering (DCR) system; a low overhead, online approach for improving instruction locality. DCR has three optimizations: (1) Interprocedural method separation; (2) Intraprocedural code splitting; and (3) Code padding. DCR uses the dynamic call graph and an edge profile that most VMs already collect to separate hot/cold methods and hot/cold code within a method. It also puts padding between methods to minimize conflict misses between frequent caller/callee pairs. It incrementally performs these optimizations only when the VM is optimizing a method at a higher level. We implement DCR in Jikes RVM and show its overhead is negligible. Extensive simulation and run-time experiments show that a simple code space improves average performance on a Pentium 4 by around 6% on SPEC and DaCapo Java benchmarks. These programs however have very small instruction cache footprints that limit opportunities for DCR to improve performance. Consequently, DCR optimizations on average show little effect, sometimes degrading performance and occasionally improving performance by up to 5%. Our work shows that the VM has the potential to dynamically improve instruction locality incrementally by simply piggybacking on hotspot recompilation.
منابع مشابه
Fast Recompilation of Object Oriented Modules
Once a program file is modified, the recompilation time should be minimized, without sacrificing execution speed or high level object oriented features. The recompilation time is often a problem for the large graphical interactive distributed applications tackled by modern OO languages. A compilation server and fast code generator were developed and integrated with the SRC Modula-3 compiler and...
متن کاملAn Infrastructure for Profile-Driven Dynamic Recompilation
Dynamic optimization of computer programs can dramatically improve their performance on a variety of applications. This paper presents an efficient infrastructure for dynamic recompilation that can support a wide range of dynamic optimizations including profile-driven optimizations. The infrastructure allows any section of code to be optimized and regenerated on-the-fly, even code for currently...
متن کاملEfficient Compilation and Profile-driven Dynamic Recompilation in Scheme
This dissertation presents a fast and effective linear intraprocedural register allocation strategy and an infrastructure for profile-driven dynamic recompilation in Scheme. The register allocation strategy optimizes register usage across procedure calls. It capitalizes on our observation that while procedures that do not contain calls (syntactic leaf routines) account for under one third of al...
متن کاملAn Efficient Execution Model for Dynamically Reconfigurable Component Software∗ Position Paper
Traditional, statically composed software can be globally optimized for execution speed at compile-time. For dynamically composed component software, a different execution model has to be employed to achieve comparable performance. As components are developed and compiled separately, little is known at compile-time about the environment they will be used in later. The global information require...
متن کاملDynamic Path Profile Aided Recompilation in a JAVA Just-In-Time Compiler
Just-in-Time (JIT) compilers for Java can be augmented by making use of runtime profile information to produce better quality code and hence achieve higher performance. In a JIT compilation environment, the profile information obtained can be readily exploited in the same run to aid recompilation and optimization of frequently executed (hot) methods. This paper discusses a low overhead path pro...
متن کامل