Overview of the Pipe Processor Implementation

Authors

Matthew K. Farrens

Andrew R. Pleszkun

Abstract

The PIPE processor is an outgrowth of the PIPE Project, a research project at the University of Wisconsin-Madison whose goal was to investigate computer architectures that would be well suited to VLSI implementation. The implemented PIPE processor is a 32-bit pipelined single-chip processor with a simplified load-store instruction set, a five-stage pipeline, a two-cycle ALU, and the following unique features: (1) Architectural I/O queues that lie between the processor internals and the external memory. These queues are used to reduce the impact of memory delays on processor performance. (2) A delayed branch scheme that allows the compiler to specify the number of instructions after a branch that will be unconditionally executed (between 0 and 7), based on how well it was able to schedule code. (3) A sophisticated instruction fetch mechanism featuring a small on-chip instruction cache, an instruction queue and an instruction queue buffer that together perform as well as a much larger conventional instruction cache. (4) A register file that is divided into foreground and background registers, to improve the performance of subroutine calls. Extensive simulations of the original design indicated that the features listed above provide significant performance improvements. However, it was felt that this combination of architectural features was sufficiently unique to justify an actual implementation of the processor, to investigate whether they would work as well in practice as in theory. The processor was fabricated by MOSIS in 1.5-micron nMOS, and is 2-3 times faster than the 1.5-micron nMOS versions of the RISC and MIPS chips. This improved performance is due in large part to the presence of the I/O queues, which allow the processor internals to run at a clock rate completely independent of the external memory speed. Many other valuable lessons were learned from the implementation, and a number of new questions have been generated whose resolution is still under investigation. We feel that the benefits from implementing the processor have definitely been worth the effort.
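
The delayed branch scheme in feature (2) can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not the PIPE instruction set: the tuple encoding of instructions, the `run` function, and the rule that a branch may not sit in another branch's delay slots are all hypothetical choices made for illustration. The only detail taken from the abstract is that a branch carries a count from 0 to 7 of following instructions that execute unconditionally.

```python
def run(program, max_steps=1000):
    """Execute a toy program and return the labels of executed instructions.

    Hypothetical encoding (not the real PIPE ISA):
      ("op", label)                 ordinary instruction
      ("br", target, delay, taken)  branch to index `target`; the next `delay`
                                    instructions (0 to 7) always execute
    """
    trace = []
    pc = 0
    pending = None  # (delay slots remaining, target) for a taken branch
    steps = 0
    while pc < len(program) and steps < max_steps:
        instr = program[pc]
        pc += 1
        steps += 1
        if instr[0] == "br":
            if pending is not None:
                # Assumption of this sketch: no branches inside delay slots.
                raise ValueError("branch in a delay slot is not modeled")
            _, target, delay, taken = instr
            if not 0 <= delay <= 7:
                raise ValueError("delay count must be between 0 and 7")
            trace.append(f"br->{target}")
            if taken:
                if delay == 0:
                    pc = target  # no delay slots: redirect immediately
                else:
                    pending = (delay, target)
            continue
        trace.append(instr[1])
        if pending is not None:
            slots, target = pending
            slots -= 1
            if slots == 0:
                pc = target  # all delay slots consumed: take the branch
                pending = None
            else:
                pending = (slots, target)
    return trace


def example(delay, taken):
    """Build a six-instruction program around one branch and run it."""
    program = [
        ("op", "a"),
        ("br", 5, delay, taken),
        ("op", "b"),
        ("op", "c"),
        ("op", "skipped"),
        ("op", "d"),
    ]
    return run(program)
```

In this sketch, `example(2, True)` executes `b` and `c` after the branch and then jumps over `skipped`, whereas `example(0, True)` jumps straight to `d`. A compiler that found two useful instructions to schedule after the branch would pick the first form; one that found none would pick the second rather than pad with no-ops.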

Similar resources

Design and Implementation of Field Programmable Gate Array Based Baseband Processor for Passive Radio Frequency Identification Tag (TECHNICAL NOTE)

In this paper, an Ultra High Frequency (UHF) base band processor for a passive tag is presented. It proposes a Radio Frequency Identification (RFID) tag digital base band architecture which is compatible with the EPC C C2/ISO18000-6B protocol. Several design approaches such as clock gating technique, clock strobe design and clock management are used. In order to reduce the area Decimal Matrix C...

Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal

Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks, because noise arises during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...

Pentium III Processor Implementation Tradeoffs

This paper discusses the implementation tradeoffs of the Pentium III processor. The Pentium III processor implements a new extension of the IA-32 instruction set called the Internet Streaming Single-Instruction, Multiple-Data (SIMD) Extensions (Internet SSE). The processor is based on the Pentium Pro processor microarchitecture. The initial development goals for the Pentium III processor were ...

Theoretical system-level model for power-performance trade-off in VLSI microprocessor design

This contribution provides a quantitative model of the relation between supply voltage scaling, sustainable cycle time, pipeline depth, instruction level parallelism and power dissipation. The analysis shows that there is an optimal sizing of the target supply voltage and pipe stage complexity for power minimization subject to a performance constraint. The behavior of realistic processor impleme...

Numerical and Empirical Investigation of Flow Separation Phenomenon around Semi-buried Pipelines due to Steady Currents

In this paper, in order to understand the flow-pipe interaction more clearly, the variations in flow pattern around semi-buried pipelines due to steady current have been investigated physically and numerically. In the physical modeling section, the experiments were carried out in a flume 10 m long, 0.3 m wide and 0.5 m deep using a PVC pipe 6.35 cm in diameter (for ...

Publication date: 1991
