Microarchitecture and implementation of the synergistic processor in 65-nm and 90-nm SOI

Authors

  • Brian K. Flachs
  • Shigehiro Asano
  • Sang H. Dhong
  • H. Peter Hofstee
  • Gilles Gervais
  • Roy Kim
  • Tien Le
  • Peichun Liu
  • Jens Leenstra
  • John S. Liberty
  • Brad W. Michael
  • Hwa-Joon Oh
  • Silvia M. Müller
  • Osamu Takahashi
  • Koji Hirairi
  • Atsushi Kawasumi
  • Hiroaki Murakami
  • Hiromi Noro
  • Shoji Onishi
  • Juergen Pille
  • Joel Silberman
  • Suksoon Yong
  • Akiyuki Hatakeyama
  • Yukio Watanabe
  • Naoka Yano
  • Daniel A. Brokenshire
  • Mohammad Peyravian
  • VanDung To
  • Eiji Iwata
Abstract

This paper describes the architecture and implementation of the original gaming-oriented synergistic processor element (SPE) in both 90-nm and 65-nm silicon-on-insulator (SOI) technology and introduces a new SPE implementation targeted at the high-performance computing community. The Cell Broadband Engine processor contains eight SPEs. The dual-issue, four-way single-instruction multiple-data (SIMD) processor is designed to achieve high performance per area and power and is optimized to process streaming data, simulate physical phenomena, and render objects digitally. Most aspects of data movement and instruction flow are controlled by software to improve the performance of the memory system and the core performance density. The SPE was designed as an 11-FO4 (fan-out-of-4-inverter-delay) processor using 20.9 million transistors within 14.8 mm² in the IBM 90-nm SOI low-k process. Static CMOS (complementary metal-oxide semiconductor) gates implement the majority of the logic. Dynamic circuits are used in critical areas and occupy 19% of the non-SRAM (static random access memory) area. Instruction set architecture, microarchitecture, and physical implementation are tightly coupled to achieve a compact and power-efficient design. Correct operation has been observed at up to 5.6 GHz and 7.3 GHz, respectively, in 90-nm and 65-nm SOI technology.
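The figures quoted in the abstract can be combined into a few back-of-the-envelope numbers. The sketch below is only illustrative arithmetic on those figures; the function names are hypothetical, and the per-FO4 delay estimate rests on the assumption (not stated in the abstract) that the highest observed 90-nm frequency corresponds to an 11-FO4 cycle.

```python
# Figures quoted in the abstract for the 90-nm SOI implementation.
TRANSISTORS = 20.9e6   # transistor count
AREA_MM2 = 14.8        # SPE area in mm^2
FO4_PER_CYCLE = 11     # design point: 11 FO4 inverter delays per cycle
F_90NM_GHZ = 5.6       # highest observed frequency in 90 nm
F_65NM_GHZ = 7.3       # highest observed frequency in 65 nm


def density_per_mm2(transistors: float, area_mm2: float) -> float:
    """Average transistor density in transistors per mm^2."""
    return transistors / area_mm2


def cycle_time_ps(freq_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def implied_fo4_ps(freq_ghz: float, fo4_per_cycle: int) -> float:
    """Per-FO4 delay in ps, ASSUMING the given frequency is reached
    with exactly fo4_per_cycle FO4 delays per cycle (an assumption)."""
    return cycle_time_ps(freq_ghz) / fo4_per_cycle


if __name__ == "__main__":
    print(f"density: {density_per_mm2(TRANSISTORS, AREA_MM2) / 1e6:.2f} M transistors/mm^2")
    print(f"90-nm cycle time: {cycle_time_ps(F_90NM_GHZ):.1f} ps")
    print(f"implied FO4 delay (assumed): {implied_fo4_ps(F_90NM_GHZ, FO4_PER_CYCLE):.1f} ps")
    print(f"65-nm vs 90-nm peak frequency ratio: {F_65NM_GHZ / F_90NM_GHZ:.2f}")
```

The peak frequencies were presumably observed under favorable operating conditions, so the implied FO4 delay should be read as a rough lower bound rather than a nominal process figure.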

Related resources

Implementation Of Low Power SRAM By Using 8T Decoupled Logic

We present a novel half-select disturb free transistor SRAM cell. The cell is 6T based and utilizes decoupling logic. It employs gated inverter SRAM cells to decouple the column select read disturb scenario in half-selected columns which is one of the impediments to lowering cell voltage. Furthermore, “false read” before write operation, common to conventional 6T designs due to bit-select and w...

Modified 32-Bit Shift-Add Multiplier Design for Low Power Application

Multiplication is a basic operation in any signal processing application. Multiplication is the most important one among the four arithmetic operations like addition, subtraction, and division. Multipliers are usually hardware intensive, and the main parameters of concern are high speed, low cost, and less VLSI area. The propagation time and power consumption in the multiplier are always high. ...

Decoupled Logic Based Design for Implementation Low Power Memories by 8T SRAM

We present a novel half-select disturb free transistor SRAM cell. The cell is 6T based and utilizes decoupling logic. It employs gated inverter SRAM cells to decouple the column select read disturb scenario in half-selected columns which is one of the impediments to lowering cell voltage. Furthermore, “false read” before write operation, common to conventional 6T designs due to bit-select and w...

Estimation of Soft Error Tolerance according to the Thickness of Buried Oxide and Body Bias 28-nm and 65-nm in FD-SOI Processes by a Monte-Carlo Simulation

We estimate the soft error rates of FD-SOI structures according to the thicknesses of BOX (buried oxide) layers and body bias on 65-nm and 28-nm processes by reducing the supply voltage. A Monte-Carlo based simulation is used in this work. The parasitic bipolar effect is suppressed by thicker BOX on FD-SOI structure. The simulation results are consistent with the alpha and neutron irra...

Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal

Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks because noise arises during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents Cellular Automata (CA) framework for noise removal of distorted image by the salt an...

Journal:
  • IBM Journal of Research and Development

Volume 51, issue not listed

Pages not listed

Publication date: 2007