OpenCL on FPGAs for GPU Programmers

Abstract

Data Parallelism and Kernels

Data parallelism is a form of parallelism across multiple processors, achieved when each processor performs the same task on a different piece of distributed data. The data-parallel portions of an algorithm are executed on devices as kernels, which are C functions with some restrictions and a few language extensions. The host launches a kernel across a 1D, 2D, or 3D grid of work-items to be processed by the devices. Conceptually, work-items can be thought of as individual processing threads that each execute the same kernel function. Each work-item has a unique index within the grid and typically computes a different portion of the result. Work-items are grouped into work-groups, which are expected to execute independently of one another.
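The indexing described above can be illustrated with a small emulation. This is only a sketch in plain Python, not OpenCL host or kernel code: `launch_1d` and `vec_add_kernel` are hypothetical names chosen for illustration, and the sequential loops stand in for what a device would run in parallel. In a real OpenCL C kernel, the work-item would obtain its index with `get_global_id(0)` instead of receiving it as an argument.

```python
def vec_add_kernel(global_id, a, b, out):
    """Body executed by one work-item: it handles exactly one element."""
    out[global_id] = a[global_id] + b[global_id]


def launch_1d(kernel, global_size, local_size, *args):
    """Emulate a 1D launch of `kernel` over `global_size` work-items.

    Work-items are grouped into work-groups of `local_size` items.
    Returns a trace of (group_id, local_id, global_id) per work-item.
    """
    if local_size <= 0 or global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    trace = []
    # Work-groups are independent, so running them in any order
    # (here: sequentially) gives the same result for this kernel.
    for group_id in range(global_size // local_size):
        for local_id in range(local_size):
            # Unique index of this work-item within the whole grid.
            global_id = group_id * local_size + local_id
            kernel(global_id, *args)
            trace.append((group_id, local_id, global_id))
    return trace


a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
b = [10.0] * 8
out = [0.0] * 8
trace = launch_1d(vec_add_kernel, 8, 4, a, b, out)
print(out)       # element-wise sums, one per work-item
print(trace[5])  # (group_id, local_id, global_id) of the sixth work-item
```

With 8 work-items in groups of 4, the sixth work-item belongs to the second work-group, and its global index follows from `group_id * local_size + local_id`. Each work-item writes a different element of `out`, which is what "computes a different portion of the result" means here.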

Related resources

OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs

OpenCL support across many heterogeneous nodes (FPGAs, GPUs, CPUs) has increased the programmability of these systems significantly. At the same time, it opens up new challenges and design choices for system designers and application programmers. While OpenCL offers a universal semantic to capture the parallel behavior of applications independent of the target architecture, some customization s...

Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL

Recent developments in High Level Synthesis tools have attracted software programmers to accelerate their high-performance computing applications on FPGAs. Even though it has been shown that FPGAs can compete with GPUs in terms of performance for stencil computation, most previous work achieves this by avoiding spatial blocking and restricting input dimensions relative to FPGA on-chip memory. In...

High-performance Dynamic Programming on FPGAs with OpenCL

Field programmable gate arrays (FPGAs) provide reconfigurable computing fabrics that can be tailored to a wide range of time- and power-sensitive applications. Traditionally, programming FPGAs required expertise in complex hardware description languages (HDLs) or proprietary high-level synthesis (HLS) tools. Recently, Altera released the world's first OpenCL-conformant SDK for FPGAs. OpenCL is...

Loop2GPU: Transforming Loops to OpenCL Kernels as a LLVM Pass

Lately, programmers have started to take advantage of the GPU capabilities of their systems. Still, programming for the GPU can be very hard. We are trying to hide some of this complexity from the programmer by making the compiler automatically transform embarrassingly parallel loops to GPU kernels. To this end, we have implemented a compiler pass that transforms simple loops to OpenCL kernels.

Energy-efficient FPGA Implementation of the k-Nearest Neighbors Algorithm Using OpenCL

Modern SoCs are getting increasingly heterogeneous with a combination of multi-core architectures and hardware accelerators to speed up the execution of compute-intensive tasks at considerably lower power consumption. Modern FPGAs, due to their reasonable execution speed and comparatively lower power consumption, are strong competitors to the traditional GPU-based accelerators. High-level Synthe...

Publication date: 2014