VLIW-Based FPGA Computation Fabric with Streaming Memory Hierarchy for Medical Imaging Applications

Authors

  • Joost Hoozemans
  • Rolf Heij
  • Jeroen van Straten
  • Zaid Al-Ars
Abstract

In this paper, we present and evaluate an FPGA acceleration fabric that uses VLIW softcores as processing elements, combined with a memory hierarchy that is designed to stream data between intermediate stages of an image processing pipeline. These pipelines are commonplace in medical applications such as X-ray imagers. By using a streaming memory hierarchy, performance is increased by a factor that depends on the number of stages (7.5× when using 4 consecutive filters). Using a Xilinx VC707 board, we are able to place up to 75 cores. A platform of 64 cores can be routed at 193 MHz, achieving real-time performance, while keeping 20% of the resources available for off-board interfacing. Our VHDL implementation and associated tools (compiler, simulator, etc.) are available for download for the academic community.
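The dependence on the number of stages can be made concrete with a toy traffic count. The sketch below is not the authors' model: it assumes that, without streaming, every filter stage reads its input from and writes its output to external memory, while with streaming only the first read and the last write leave the fabric. The function names and the one-transfer-per-pixel accounting are hypothetical. It counts memory traffic only, so it is not expected to reproduce the measured 7.5× speedup quoted in the abstract.

```python
def external_transfers(num_pixels, num_stages, streaming):
    """Count pixel transfers to/from external memory (toy model, assumed)."""
    if num_stages < 1:
        raise ValueError("need at least one pipeline stage")
    if streaming:
        # Assumption: intermediate results stay on chip, so only the
        # first stage's read and the last stage's write go off chip.
        return 2 * num_pixels
    # Assumption: every stage reads and writes the full image off chip.
    return 2 * num_pixels * num_stages


def traffic_reduction(num_pixels, num_stages):
    """Ratio of non-streaming to streaming external traffic."""
    return (external_transfers(num_pixels, num_stages, streaming=False)
            / external_transfers(num_pixels, num_stages, streaming=True))


if __name__ == "__main__":
    # Hypothetical 1024x1024 image passed through 4 consecutive filters.
    print(traffic_reduction(1024 * 1024, 4))
```

Under these assumptions the traffic saving grows linearly with the number of stages; the actual speedup also depends on compute time, bandwidth, and latency, which this sketch ignores.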

Similar resources

FPGA Implementation of a Hammerstein Based Digital Predistorter for Linearizing RF Power Amplifiers with Memory Effects

Power amplifiers (PAs) are inherently nonlinear elements and digital predistortion is a highly cost-effective approach to linearize them. Although most existing architectures assume that the PA has a memoryless nonlinearity, memory effects of the PAs in many applications, such as wideband code-division multiple access (WCDMA) or orthogonal frequency-division multiplexing (OFDM), can no longer b...

CoRAM: An In-Fabric Memory Abstraction for FPGA-Based Computing

FPGAs have been used in many applications to achieve orders-of-magnitude improvement in absolute performance and energy efficiency relative to conventional microprocessors. Despite their newfound potency in both processing performance and energy efficiency, FPGAs have not gained widespread acceptance as mainstream computing devices. A fundamental obstacle to FPGA-based computing can be traced t...

An Efficient LUT Design on FPGA for Memory-Based Multiplication

An efficient Lookup Table (LUT) design for a memory-based multiplier is proposed. This multiplier can be preferred in DSP computation where one of the inputs, which is the filter coefficient to the multiplier, is fixed. In this design, all possible product terms of the input multiplicand with the fixed coefficient are stored directly in memory. In contrast to an earlier proposition Odd Multiple Storage ...

A bandwidth-efficient architecture for a streaming media processor

Media processing applications, such as three-dimensional graphics, video compression, and image processing, currently demand 10-100 billion operations per second of sustained computation. Fortunately, hundreds of arithmetic units can easily fit on a modestly sized 1 cm² chip in modern VLSI. The challenge is to provide these arithmetic units with enough data to enable them to meet the computation...

Modular VLIW processor based on FPGA

This paper describes research results on enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors in order to shorten the development cycle, and to use the powerful FPGA resources in order to increase real-time performance. We present a modular VLIW VHDL processor model with a variable instructio...


Publication date: 2017