Executing Process Networks on Heterogeneous Platforms using OpenCL
Abstract
Upcoming heterogeneous systems call for new programming paradigms. Abstracting the underlying hardware architecture is desirable in order to support productive software development. This thesis proposes a design flow and runtime system for executing process networks on heterogeneous systems using OpenCL. Process networks are a popular model of computation for deterministic parallel programming, and OpenCL is a royalty-free, standardised programming interface with broad industry support. The proposed design flow consists of a program code synthesis framework for building applications from a generic high-level process network specification. The synthesised application targets a particular OpenCL architecture that is itself defined by a high-level specification. The target code is built for this architecture specification by integrating reusable building-block primitives into it. These primitives are location-based FIFO channels that minimise the number of memory copy operations, a process wrapper that can be connected to channels and mirrors the process network functionality, and an extensible task activation framework responsible for interprocess synchronisation. Heterogeneous systems, being inherently parallel computer architectures, demand scalable and parallel applications. To simplify writing such applications, a notion of shadow copies is introduced to transparently abstract data parallelism. Extensive evaluations on two heterogeneous systems show that the proposed design flow and runtime system support a wide range of heterogeneous platforms and parallel applications. Furthermore, the evaluations show that the proposed framework is suitable for efficient and productive software development for heterogeneous systems and that it provides enough flexibility for the programmer to efficiently exploit the parallelism offered by multicore CPUs and GPUs.
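To make the process network model concrete, here is a minimal sketch in Python of three processes connected by bounded FIFO channels with blocking reads and writes. It is not the thesis's runtime system and does not use OpenCL: Python threads stand in for compute devices, and the names `Channel`, `producer`, `square`, `consumer`, `run_network`, and the `_END` sentinel are all hypothetical, chosen only for illustration.

```python
import queue
import threading

# Sentinel marking the end of a token stream (an assumption of this sketch,
# not part of the process network model itself).
_END = object()


class Channel:
    """Bounded FIFO channel: write blocks when full, read blocks when empty."""

    def __init__(self, capacity=4):
        self._q = queue.Queue(maxsize=capacity)

    def write(self, token):
        self._q.put(token)

    def read(self):
        return self._q.get()


def producer(out, n):
    """Source process: emits the tokens 0..n-1, then the end marker."""
    for i in range(n):
        out.write(i)
    out.write(_END)


def square(inp, out):
    """Worker process: reads one token, writes its square, repeats."""
    while True:
        token = inp.read()
        if token is _END:
            out.write(_END)
            return
        out.write(token * token)


def consumer(inp, results):
    """Sink process: collects tokens in arrival order."""
    while True:
        token = inp.read()
        if token is _END:
            return
        results.append(token)


def run_network(n=5):
    """Wire producer -> square -> consumer and run each process in a thread."""
    a, b = Channel(), Channel()
    results = []
    processes = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=square, args=(a, b)),
        threading.Thread(target=consumer, args=(b, results)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return results


if __name__ == "__main__":
    print(run_network(5))
```

Because each process communicates only through blocking FIFO reads and writes, the output sequence does not depend on how the threads are scheduled, which is the determinism property the abstract attributes to process networks.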
Similar resources
In-place Recursive Approach for All-pairs Shortest Paths Problem Using Opencl
The all-pairs shortest paths (APSP) problem finds the shortest path distances between all pairs of vertices, and is one of the most fundamental graph problems. In this paper, a parallel recursive partitioning approach to the APSP problem using Open Computing Language (OpenCL) for directed and dense graphs with no negative cycles, based on the R-Kleene algorithm, is presented, which recursively partitions ...
Leveraging Parallelism with CUDA and OpenCL
Graphics processing units (GPUs), originally designed for computing and manipulating pixels, have become general-purpose processors capable of executing in excess of a trillion calculations per second. Taking advantage of the GPU's compute power and commodity popularity, the field of computing systems is exhibiting a trend toward heterogeneous platforms consisting of a central processor integrated wi...
Parallel Computing for Accelerated Texture Classification with Local Binary Pattern Descriptors using OpenCL
In this paper, a novel parallelized implementation of rotation invariant texture classification using Heterogeneous Computing Platforms like CPU and Graphics Processing Unit (GPU) is proposed. A complete modeling of the LBP operator as well as its improvised versions of Complete Local Binary Patterns (CLBP) and Multi-scale Local Binary Patterns (MLBP) has been developed on a CPU and GPU based H...
GROMACS on Hybrid CPU-GPU and CPU-MIC Clusters: Preliminary Porting Experiences, Results and Next Steps (Partnership for Advanced Computing in Europe; available online at www.prace-ri.eu)
This report introduces a hybrid implementation of the GROMACS application, and provides instructions on building and executing on PRACE prototype platforms with Graphical Processing Units (GPU) and Many Integrated Cores (MIC) accelerator technologies. GROMACS currently employs message-passing MPI parallelism, multi-threading using OpenMP and contains kernels for non-bonded interactions that are ...
Performance Portability in Accelerated Parallel Kernels
Heterogeneous architectures, by definition, include multiple processing components with very different microarchitectures and execution models. In particular, computing platforms from supercomputers to smartphones can now incorporate both CPU and GPU processors. Disparities between CPU and GPU processor architectures have naturally led to distinct programming models and development patterns for...