OpenCL Task Partitioning in the Presence of GPU Contention

Authors

  • Dominik Grewe
  • Zheng Wang
  • Michael F. P. O'Boyle
Abstract

Heterogeneous multi- and many-core systems are increasingly prevalent in the desktop and mobile domains. On these systems it is common for programs to compete with co-running programs for resources. While multi-task scheduling for CPUs is a well-studied area, how to partition and map computing tasks onto a heterogeneous system in the presence of GPU contention (i.e., multiple programs competing for the GPU) remains an outstanding problem. In this paper we consider the problem of partitioning OpenCL kernels on a CPU-GPU based system in the presence of contention on the GPU. We propose a machine learning-based approach that predicts the optimal partitioning of OpenCL kernels, explicitly taking GPU contention into account. Our predictive model achieves a speed-up of 1.92 over a scheme that always uses the GPU. When compared to two state-of-the-art dynamic approaches, our model achieves speed-ups of 1.54 and 2.56, respectively.

Similar resources

A Static Task Partitioning Approach for Heterogeneous Systems Using OpenCL

Heterogeneous multi-core platforms are increasingly prevalent due to their perceived superior performance over homogeneous systems. The best performance, however, can only be achieved if tasks are accurately mapped to the right processors. OpenCL programs can be partitioned to take advantage of all the available processors in a system. However, finding the best partitioning for any heterogeneou...

Scalability and Parallel Execution of OmpSs-OpenCL Tasks on Heterogeneous CPU-GPU Environment

With heterogeneous computing becoming mainstream, researchers and software vendors have been trying to exploit the best of the underlying architectures, such as GPUs or CPUs, to enhance performance. Parallel programming models play a crucial role in achieving this enhancement. One such model is OpenCL, a parallel computing API for cross-platform computations targeting heterogeneous architectures. Ho...

TREES: A CPU/GPU Task-Parallel Runtime with Explicit Epoch Synchronization

We have developed a task-parallel runtime system, called TREES, that is designed for high performance on CPU/GPU platforms. On platforms with multiple CPUs, Cilk's "work-first" principle underlies how task-parallel applications can achieve performance, but work-first is a poor fit for GPUs. We build upon work-first to create the "work-together" principle that addresses the specific strengt...

GPGPU Computing

Since the first idea of using GPUs for general-purpose computing, things have evolved over the years, and there are now several approaches to GPU programming. GPU computing practically began with the introduction of CUDA (Compute Unified Device Architecture) by NVIDIA and Stream by AMD. These are APIs designed by the GPU vendors to be used together with the hardware that they provide. A new emergi...

Contention-Aware Scheduling of Parallel Code for Heterogeneous Systems

A typical consumer desktop computer has a multi-core CPU with at least two and up to eight processing elements over two processors, and a multi-core GPU with up to 512 processing elements. Both the CPU and the GPU are capable of running parallel code, yet it is not obvious when to utilize one processor or the other because of workload considerations and, as importantly, contention on each devic...

Publication date: 2013