Executing PRAM Programs on GPUs

Authors

  • Jurgen Brenner
  • Jörg Keller
  • Christoph W. Kessler
Abstract

We present a framework for transforming programs written in the PRAM programming language Fork into CUDA C, so that they can be compiled and executed on a graphics processor (GPU). This makes it possible to explore parallel algorithmics on a scale beyond toy problems, to which the previous sequential PRAM simulator restricted practical use. We explain the design decisions and evaluate a prototype implementation consisting of a runtime library and a set of rules for transforming simple Fork programs, which we currently apply by hand. The resulting CUDA code is almost 100 times faster than the previous simulator for compiled Fork programs and can handle larger data sizes. Compared to a sequential program for the same problem, the GPU code may be faster or slower, depending on the structure of the Fork program, i.e. on the overhead incurred. We also give an outlook on how future GPUs might notably reduce this overhead.

Similar resources

Programming Data-parallel - Executing Process-parallel

Most theoretical work is based on the PRAM-model which has a block of shared memory and executes in a synchronous lock-step mode. Real hardware usually executes asynchronously and uses local memory and message passing. The recent LogP-model reflects these architectural properties. We show that for a practically important subclass of PRAM-programs it is possible to transform them into LogP-progra...

ÆminiumGPU: An Intelligent Framework for GPU Programming

As a consequence of the immense computational power available in GPUs, the usage of these platforms for running data-intensive general purpose programs has been increasing. Since memory and processor architectures of CPUs and GPUs are substantially different, programs designed for each platform are also very different and often resort to a very distinct set of algorithms and data structures. Se...

Solving the Maxwell-Bloch equations for resonant nonlinear optics using GPUs

We solve the Maxwell-Bloch equations of resonant nonlinear optics using GPUs and compare the computation times with traditional single- and multithreaded programs. A detailed benchmarking of programs as a function of various parameters shows how the massive parallelism built into GPUs becomes more and more advantageous as the physical problem becomes more and more demanding. For the case of multi...

In-place Recursive Approach for All-pairs Shortest Paths Problem Using Opencl

The all-pairs shortest paths (APSP) problem finds the shortest path distances between all pairs of vertices, and is one of the most fundamental graph problems. In this paper, a parallel recursive partitioning approach to the APSP problem using Open Computing Language (OpenCL) for directed and dense graphs with no negative cycles, based on the R-Kleene algorithm, is presented, which recursively partitions ...

OpenCL-based optimizations for acceleration of object tracking on FPGAs and GPUs

OpenCL support across many heterogeneous nodes (FPGAs, GPUs, CPUs) has increased the programmability of these systems significantly. At the same time, it opens up new challenges and design choices for system designers and application programmers. While OpenCL offers a universal semantic to capture the parallel behavior of applications independent of the target architecture, some customization s...

Publication date: 2012