Search results for: instruction fetch

Number of results: 42508

2002
Resit Sendag David J. Lilja Steven R. Kunkel

As the degree of instruction-level parallelism in superscalar architectures increases, the gap between processor and memory performance continues to grow requiring more aggressive techniques to increase the performance of the memory system. We propose a new technique, which is based on the wrong-path execution of loads far beyond instruction fetch-limiting conditional branches, to exploit more ...

2002
Weiyu Tang Alexander V. Veidenbaum Alexandru Nicolau Rajesh K. Gupta

In this paper, we present a Branch Target Buffer (BTB) design for energy savings in set-associative instruction caches. We extend the functionality of a BTB by caching way predictions in addition to branch target addresses. Way prediction and branch target prediction are done in parallel. Instruction cache energy savings are achieved by accessing one cache way if the way prediction for a fetch i...

Journal: Microprocessors and Microsystems - Embedded Hardware Design 1998
Narayan Ranganathan Manoj Franklin

This paper presents a microarchitecture based on exploiting the locality of data dependences for efficiently executing many instructions per cycle. The instruction window is split into multiple hardware units, and the instruction stream is distributed among them in such a way that data dependent instructions are generally allocated to the same unit. The fetch bandwidth of the processor is enhance...

1997
Bill Appelbe Reid Harmon Maurizio Vitale Sri Doddapaneni Scott Wills

The performance and hardware complexity of superscalar architectures is hindered by conditional branch instructions. When conditional branches are encountered in a program, the instruction fetch unit must rapidly predict the branch predicate and begin speculatively fetching instructions with no loss of instruction throughput. Speculative execution increases hardware cost, since speculative inst...

2008
Handong Ye Ge Gan Ziang Hu Guang R. Gao Xiaomi An

A SMT processor can fetch and issue instructions from multiple independent hardware threads at every CPU cycle. Therefore, hardware resources are shared among the concurrently-running threads at a very fine grain level, which can increase the utilization of processor pipeline. However, the concurrently-running threads in a SMT processor may interfere with each other and stall the CPU pipeline. ...

2000
Dana S. Henry Bradley C. Kuszmaul Gabriel H. Loh

Our program benchmarks and simulations of novel circuits indicate that large-window processors are feasible. Using our redesigned superscalar components, a large-window processor implemented in today’s technology can achieve an increase of 10–60% (geometric mean of 31%) in program speed compared to today’s processors. The processor operates at clock speeds comparable to today’s processors, but ...

1995
Linley Gwennap

Intel’s P6 processor (see 090202.PDF) is the first to use a two-level branch-prediction algorithm to improve accuracy. This algorithm, first published by Tse-Yu Yeh and Yale Patt, has the potential to push accuracy well beyond the 90% level achieved by the best processors today. As future processors look to improve performance by increasing the issue rate and/or extending the pipeline depth, th...

1991
Matthew K. Farrens Andrew R. Pleszkun

The PIPE processor is an outgrowth of the PIPE Project, a research project at the University of Wisconsin-Madison whose goal was to investigate computer architectures that would be well suited to VLSI implementation. The implemented PIPE processor is a 32-bit pipelined single chip processor with a simplified load-store instruction set, a 5 stage pipeline, a two-cycle ALU, and the following uniq...

2009
Hadi Esmaeilzadeh Doug Burger

Predication of control edges has the potential advantages of improving fetch bandwidth and reducing branch mispredictions. However, heavily predicated code in out-of-order processors can lose significant performance by deferring resolution of the predicates until they are executed, whereas in nonpredicated code those control arcs would have remained as branches, and would be resolved immediatel...

1999
Shu-Lin HWANG Che-Chun CHEN

Modern micro-architectures employ superscalar techniques to enhance system performance. Superscalar microprocessors must fetch at least one instruction cache line at a time to support a high issue rate and a large amount of speculative execution, so multiple branches are often encountered in one cycle. In practical implementations this would cause serious problems while ...
