Combining high productivity and high performance in image processing using Single Assignment C on multi-core CPUs and many-core GPUs
نویسندگان
چکیده
In this paper the challenge of parallelization development of industrial high performance inspection systems is addressed concerning a conventional parallelization approach versus an auto-parallelized technique. Therefore, we introduce the functional array processing language Single Assignment C (SaC), which relies on a hardware virtualization concept for automated, parallel machine code generation for multicore CPUs and GPUs. Additional, software engineering aspects like programmability, productivity, understandability, maintainability and resulting achieved gain in performance are discussed from the point of view of a developer. With several illustrative benchmarking examples from the field of image processing and machine learning, the relationship between runtime performance and efficiency of development is analyzed. Further author information: 1Corresponding author, Volkmar Wieser: E-mail: [email protected], Telephone: +43 7236 3343 844 Clemens Grelck: E-mail: [email protected], Telephone: +31 20 525 8683 Peter Haslinger: E-mail: [email protected], Telephone: +43 7236 3343 834 Jing Guo: E-mail: [email protected], Telephone: +44 1707 28 3360 Filip Korzeniowski: E-mail: [email protected], Telephone: +43 7236 3343 838 Robert Bernecky: E-mail: [email protected], Telephone: +1 416 203 0854 Bernhard Moser: E-mail: [email protected], Telephone: +43 7236 3343 833 Sven-Bodo Scholz: E-mail: [email protected], Telephone: +44 131 451 3814
منابع مشابه
Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems
Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge due to the difficulty of schedulation of hardware resources regarding the concurrency of threads. In this paper, for resolving the problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...
متن کاملMixing Multi-Core CPUs and GPUs for Scientific Simulation Software
Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some applications paradigms. Software languages and systems such as NVIDIA’s CUDA and Khronos consortium’s open compute lan...
متن کاملApplication performance analysis and efficient execution on systems with multi-core CPUs, GPUs and MICs: a case study with microscopy image analysis
We carry out a comparative performance study of multi-core CPUs, GPUs and Intel Xeon Phi (Many Integrated Core-MIC) with a microscopy image analysis application. We experimentally evaluate the performance of computing devices on core operations of the application. We correlate the observed performance with the characteristics of computing devices and data access patterns, computation complexiti...
متن کاملPerformance Evaluation and Analysis for Conjugate Gradient Solver on Heterogeneous (Multi-GPUs/Multi-CPUs) platforms
High performance computing (HPC) presents a technology that allows solving high intensive problems in a reasonable period of time, and can offer many advantages for large applications in various fields of science and industry. Current multi-core processors, especially graphic processing units (GPUs), have quickly evolved to become efficient accelerators for data parallel computing. They can mai...
متن کاملViennaCL - A High Level Linear Algebra Library for GPUs and Multi-Core CPUs
The vast computing resources in graphics processing units (GPUs) have become very attractive for general purpose scientific computing over the past years. Moreover, central processing units (CPUs) consist of an increasing number of individual cores. Most applications today still make use of a single core only, because standard data types and algorithms in wide-spread procedural languages such a...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Electronic Imaging
دوره 21 شماره
صفحات -
تاریخ انتشار 2012