instruction fetch

Latency Tolerant Branch Predictors

2001

Oliverio J. Santana Alex Ramirez

The access latency of branch predictors is a well-known problem of fetch engine design. Prediction overriding techniques are commonly accepted to overcome this problem. However, prediction overriding requires a complex recovery mechanism to discard the wrong speculative work based on overridden predictions. In this paper, we show that stream and trace predictors, which use long basic prediction...


Versatile Processor Design for Efficiency and High Performance

2000

Sotirios G. Ziavras

We present new architectural concepts for uniprocessor designs that conform to the data-driven computation paradigm. Usage of our D-CPU (Data-Driven processor) follows the natural flow of programs, minimizes the number of redundant operations, lowers the hardware cost, and reduces the power consumption. Instead of giving the CPU the privileged right of deciding what instructions to fetch in eac...


Improving the WCET computation in the presence of a lockable instruction cache in multitasking real-time systems

Journal: Journal of Systems Architecture - Embedded Systems Design, 2011

Luis C. Aparicio Juan Segarra Clemente Rodríguez Lafuente Víctor Viñals

In multitasking real-time systems it is required to compute the WCET of each task and also the effects of interferences between tasks in the worst case. This is very complex with variable latency hardware, such as instruction cache memories, or, to a lesser extent, the line buffers usually found in the fetch path of commercial processors. Some methods disable cache replacement so that it is eas...


Program decision logic optimization using predication and control speculation

2001

Wen-mei W. Hwu David I. August John W. Sias

The mainstream arrival of predication, a means other than branching of selecting instructions for execution, has required compiler architects to reformulate fundamental analyses and transformations. Traditionally, the compiler has generated branches straightforwardly to implement control flow designed by the programmer and has then performed sophisticated “global” optimizations to move and opti...


The Impact of Resource Sharing Control on the Design of Multicore Processors

2009

Chen Liu Jean-Luc Gaudiot

One major obstacle faced by designers when entering the multicore era is how to harness the massive computing power which these cores provide. Since Instruction-Level Parallelism (ILP) is inherently limited, one single thread is not capable of efficiently utilizing the resources of a single core. Hence, Simultaneous MultiThreading (SMT) microarchitecture can be introduced in an effort to achie...


Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption

2002

Weiyu Tang Alexander V. Veidenbaum Alexandru Nicolau Rajesh Gupta

In this paper we present a Branch Target Buffer (BTB) design for energy savings in set-associative instruction caches. We extend the functionality of a BTB by caching way predictions in addition to branch target addresses. Way prediction and branch target prediction are done in parallel. Instruction cache energy savings are achieved by accessing one cache way if the way prediction for a fetch is av...
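
The idea in this abstract can be illustrated with a toy model. The following is a minimal sketch, not the authors' design: the cache geometry, the `BTBEntry` and `ICache` names, the fall-back of probing all ways when no prediction exists, and the use of probed ways as an energy proxy are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed (hypothetical) cache geometry, chosen only for illustration.
LINE_BYTES = 32
NUM_SETS = 4
NUM_WAYS = 4


@dataclass
class BTBEntry:
    target: int  # predicted branch target address
    way: int     # predicted I-cache way holding the target line


class ICache:
    def __init__(self):
        # tags[set][way] holds the stored line tag, or None if empty.
        self.tags = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
        self.way_probes = 0  # number of ways read: a crude energy proxy

    def _index_tag(self, addr):
        line = addr // LINE_BYTES
        return line % NUM_SETS, line // NUM_SETS

    def fill(self, addr, way):
        s, t = self._index_tag(addr)
        self.tags[s][way] = t

    def fetch(self, addr, predicted_way=None):
        """Return the way that hits, or None on a miss."""
        s, t = self._index_tag(addr)
        if predicted_way is not None:
            self.way_probes += 1  # probe only the predicted way first
            if self.tags[s][predicted_way] == t:
                return predicted_way
            # Wrong way prediction: fall back to the remaining ways.
            remaining = [w for w in range(NUM_WAYS) if w != predicted_way]
        else:
            remaining = list(range(NUM_WAYS))
        self.way_probes += len(remaining)
        for w in remaining:
            if self.tags[s][w] == t:
                return w
        return None


def fetch_branch_target(cache, btb, branch_pc, target):
    """Fetch a taken branch's target, using the BTB's way prediction if any."""
    entry = btb.get(branch_pc)
    predicted = entry.way if entry and entry.target == target else None
    hit_way = cache.fetch(target, predicted)
    if hit_way is not None:
        # Cache the way alongside the target for the next fetch.
        btb[branch_pc] = BTBEntry(target, hit_way)
    return hit_way
```

In this model, the first fetch of a branch target reads every way of the set, while later fetches through the same BTB entry read a single way as long as the line has not moved.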


High Performance and Energy Efficient Serial Prefetch Architecture

2002

Glenn Reinman Brad Calder Todd M. Austin

Energy efficient architecture research has flourished recently, in an attempt to address packaging and cooling concerns of current microprocessor designs, as well as battery life for mobile computers. Moreover, architects have become increasingly concerned with the complexity of their designs in the face of scalability, verification, and manufacturing concerns. In this paper, we propose and eva...


Using Branch Prediction Information for Near-Optimal I-Cache Leakage

2006

Sung Woo Chung Kevin Skadron

This paper describes a new on-demand wakeup prediction policy for instruction cache leakage control that achieves better leakage savings than prior policies and avoids their performance overheads. The proposed policy reduces leakage energy by more than 92% with less than 0.3% performance overhead on average. The key to this new on-demand policy is to use branch prediction ...


A NoC-based hybrid message-passing/shared-memory approach to CMP design

Journal: Microprocessors and Microsystems - Embedded Hardware Design, 2011

Mario R. Casu Massimo Ruo Roch Sergio Tota Maurizio Zamboni

Future chip-multiprocessors (CMP) will integrate many cores interconnected with a high-bandwidth and low-latency scalable network-on-chip (NoC). However, the potential that this approach offers at the transport level needs to be paired with an analogous paradigm shift at the higher levels. In particular, the standard shared-memory programming model fails to address the requirements of scalabili...


Processor design based on dataflow concurrency

Journal: Microprocessors and Microsystems, 2003

Sotirios G. Ziavras

This paper presents new architectural concepts for uniprocessor system designs. They result in a uniprocessor design that conforms to the data-driven (i.e., dataflow) computation paradigm. It is shown that usage of this processor, the D-CPU (Data-Driven processor), follows the natural flow of programs, minimizes redundant (micro)operations, lowers the hardware cost, and reduces the power consumption. ...
