Improvement of energy-efficiency in off-chip caches by selective prefetching

Authors

  • Jonas Jalminger
  • Per Stenström
Abstract

In this paper we revisit the line size/performance trade-offs in off-chip second-level caches in light of energy efficiency and make two distinct contributions. First, we observe that while large blocks (128-256 bytes) typically improve performance, they cause devastating energy dissipation because limited spatial locality results in low block utilization. We find that blocks as small as 32 bytes can cut the energy-delay product (execution time multiplied by energy used) by more than a factor of two, with only a moderate performance loss of less than 25%. Second, as one technique to exploit spatial locality, we apply selective software-controlled prefetching to compensate for the moderate performance losses of small blocks. Aided by a performance-tuning tool (in our case study we use SimICS), we identify the memory instructions that contribute most of the misses. In the seven applications we have studied, typically fewer than ten memory instructions cause half of the misses. By inserting prefetch instructions to cover their misses, we show that it is possible to reach the performance level of large blocks using a small block. Moreover, prefetching combined with small blocks yields the same low energy dissipation as small blocks with no prefetching.
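
The selection step described in the abstract can be sketched roughly as follows. This is an illustration only, not the authors' tooling: the per-instruction miss counts would come from a profiler such as SimICS, and the function names, the dictionary layout, and the sample numbers below are all hypothetical. The sketch picks the fewest memory instructions whose misses add up to a chosen share of all misses, and includes a helper for the energy-delay product as the abstract defines it.

```python
def energy_delay_product(execution_time, energy):
    """Energy-delay product: execution time multiplied by energy used."""
    return execution_time * energy


def select_prefetch_candidates(miss_counts, coverage=0.5):
    """Return the fewest instructions whose misses reach `coverage` of all misses.

    miss_counts maps an instruction address (hypothetical profiler output)
    to the number of cache misses attributed to that instruction.
    """
    total = sum(miss_counts.values())
    if total == 0:
        return []
    selected = []
    covered = 0
    # Greedy choice: take the instructions with the most misses first,
    # which minimizes how many instructions are needed for the target share.
    ranked = sorted(miss_counts.items(), key=lambda kv: kv[1], reverse=True)
    for pc, misses in ranked:
        if covered >= coverage * total:
            break
        selected.append(pc)
        covered += misses
    return selected


# Made-up profile (illustrative only): instruction address -> miss count.
profile = {0x4010: 500, 0x4024: 300, 0x4100: 120, 0x4208: 50, 0x4300: 30}
candidates = select_prefetch_candidates(profile)
```

The instructions returned in `candidates` would be the ones to cover with prefetch instructions; the default `coverage=0.5` mirrors the "half of the misses" figure quoted in the abstract and is otherwise an arbitrary choice.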

Related resources

Performance/Energy Efficiency of Variable Line-Size Caches for Intelligent Memory Systems

Integrating main memory (DRAM) and processors into a single chip, or merged DRAM/logic LSI, makes it possible to exploit high on-chip memory bandwidth by widening the on-chip bus and on-chip DRAM array. In addition, from an energy consumption point of view, the integration brings a significant improvement by decreasing the number of off-chip accesses. For merged DRAM/logic LSIs having on-chip cache me...

Performance Issues in Integrating Temporality-Based Caching with Prefetching

This work evaluates the performance effectiveness of combining two techniques for improving cache hit rate and reducing memory traffic in small on-chip direct-mapped caches. Temporality-based caching is an efficient technique for reducing unnecessary cache block conflicts in direct-mapped caches, but does not address compulsory misses. Tagged prefetching is a known technique for controlling compulsor...

Energy-Efficiency of VLSI Caches: A Comparative Study

We investigate the use of organizational alternatives that lead to more energy-efficient caches for contemporary microprocessors. Dissipative transitions are likely to be highly correlated and skewed in caches, precluding the use of simplistic hit/miss ratio based power dissipation models for accurate power estimations. We use a detailed register-level simulator for a typical pipelined CPU and ...

Instruction Prefetching of Systems Codes with Layout Optimized for Reduced Cache Misses

High-performing on-chip instruction caches are crucial to keep fast processors busy. Unfortunately, while on-chip caches are usually successful at intercepting instruction fetches in loop-intensive engineering codes, they are less able to do so in large systems codes. To improve the performance of the latter codes, the compiler can be used to lay out the code in memory for reduced cache conflict...

A Performance Study of Instruction Cache Prefetching Methods

Prefetching methods for instruction caches are studied via trace-driven simulation. The two primary methods are “fall-through” prefetch (sometimes referred to as “one block lookahead”) and “target” prefetch. Fall-through prefetches are for sequential line accesses, and a key parameter is the distance from the end of the current line where the prefetch for the next line is initiated. Target prefe...

Journal:
  • Microprocessors and Microsystems

Volume 26  Issue

Pages  -

Publication date 2002