A 45nm 1.3GHz 16.7 Double-Precision GFLOPS/W RISC-V Processor with Vector Accelerator

Authors

  • Torsten Hoefler
  • Markus Püschel
  • Salvatore Di Girolamo

Abstract

A 64-bit dual-core RISC-V processor with vector accelerators has been fabricated in a 45nm SOI process. This is the first dual-core processor to implement the open-source RISC-V ISA designed at the University of California, Berkeley. In a standard 40nm process, the RISC-V scalar core scores 10% higher in DMIPS/MHz than the Cortex-A5, ARM’s comparable single-issue in-order scalar core, and is 49% more area-efficient. To demonstrate the extensibility of the RISC-V ISA, we integrate a custom vector accelerator alongside each single-issue in-order scalar core. The vector accelerator is 1.8× more energy-efficient than the IBM Blue Gene/Q processor, and 2.6× more than the IBM Cell processor, both fabricated in the same process. The dual-core RISC-V processor achieves maximum clock frequency of 1.3GHz at 1.2V and peak energy efficiency of 16.7 double-precision GFLOPS/W at 0.65V with an area of 3mm².

Similar resources

A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores

FFT algorithms have memory access patterns that prevent many architectures from achieving high computational utilization, particularly when parallel processing is required to achieve the desired levels of performance. Starting with a highly efficient hybrid linear algebra/FFT core, we co-design the on-chip memory hierarchy, on-chip interconnect, and FFT algorithms for a multicore FFT processor....

Implementing 3D Jacobi Method in Cell

The IBM Cell Broadband Engine (CBE) is an interesting architecture that provides amazing performance for computation-intensive floating-point applications, especially in single precision. Cell also provides very impressive GFLOPS for double-precision operations. The 3D Jacobi method is heavily used in scientific computations. In this project, we implement the parallel version of 3D Jacobi me...

Threaded MPI programming model for the Epiphany RISC array processor

The low-power Adapteva Epiphany RISC array processor offers high computational energy-efficiency and parallel scalability. However, extracting performance with a standard parallel programming model remains a great challenge. We present an effective programming model for the Epiphany architecture based on the Message Passing Interface (MPI) standard adapted for coprocessor offload. Using MPI exploits th...

A Buffered-Mode MPI Implementation for the Cell BE™ Processor

The Cell Broadband Engine™ is a heterogeneous multi-core architecture developed by IBM, Sony and Toshiba. It has eight computation-intensive cores (SPEs) with a small local memory, and a single PowerPC core. The SPEs have a total peak single-precision performance of 204.8 Gflops/s, and 14.64 Gflops/s in double precision. Therefore, the Cell has a good potential for high performance computing. ...

FPGA accelerator for floating-point matrix multiplication

This study treats the architecture and implementation of an FPGA accelerator for double-precision floating-point matrix multiplication. The architecture is oriented towards minimising resource utilisation and maximising clock frequency. It employs the block matrix multiplication algorithm, which returns the result blocks to the host processor as soon as they are computed. This avoids output buffering...

Publication date: 2017