Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries
نویسندگان
چکیده
We present a comparison of several modern C++ libraries providing high-level interfaces for programming multiand many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is based on odeint, a framework for the solution of systems of ordinary differential equations. Odeint is designed in a very flexible way and may be easily adapted for effective use of libraries such as MTL4, VexCL, or ViennaCL, using CUDA or OpenCL technologies. We found that CUDA and OpenCL work equally well for problems of large sizes, while OpenCL has higher overhead for smaller problems. Furthermore, we show that modern high-level libraries allow to effectively use the computational resources of many-core GPUs or multi-core CPUs without much knowledge of the underlying technologies.
منابع مشابه
Extending the Capabilities of the Cray Programming Environment with Clang-LLVM Framework Integration
Recent developments in programming for multicore processors and accelerators using C++11, OpenCL and Domain Specific Languages (DSL) have prompted us to look into tools that offer compilers and both static and runtime analysis toolchains to complement the Cray Programming Environment capabilities. In this paper we report our preliminary experiences from using the CLang-LLVM framework on a hybri...
متن کاملOpenCL: A pArALLeL prOgrAmming StAndArd
T he strong need for increased computational performance in science and engineering has led to the use of heterogeneous computing, with GPUs and other accelerators acting as coprocessors for arithmetic intensive data-parallel workloads.1–4 OpenCL is a new industry standard for task-parallel and data-parallel heterogeneous computing on a variety of modern CPUs, GPUs, DSPs, and other microprocess...
متن کاملGeometric Algebra enhanced Precompiler for C++ and OpenCL
The focus of the this work is a simplified integration of algorithms expressed in Geometric Algebra (GA) in modern high level computer languages, namely C++, OpenCL and CUDA. A high runtime performance in terms of GA is achieved using symbolic simplification and code generation by a Precompiler that is directly integrated into CMake-based build toolchains.
متن کاملProgramming GPUs with C++14 and Just-In-Time Compilation
Systems that comprise accelerators (e.g., GPUs) promise high performance, but their programming is still a challenge, mainly because of two reasons: 1) two distinct programming models have to be used within an application: one for the host CPU (e.g., C++), and one for the accelerator (e.g., OpenCL or CUDA); 2) using Just-In-Time (JIT) compilation and its optimization opportunities in both OpenC...
متن کاملGPU Acceleration for the C++ Standard Template Library
Modern programmers must exploit parallelism for performance gains, possibly through the use of an attached or on-chip GPU. To take advantage of the GPU in C++ programs, the programmer must use either a new language (CUDA or OpenCL) or an external library (Thrust). Rather than requiring that programmers learn new tools, modify existing code, and change software development practices, the C++ Sta...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- SIAM J. Scientific Computing
دوره 35 شماره
صفحات -
تاریخ انتشار 2013