gpu

An Effective Model of CPU/GPU Collaborative Computing in GPU Clusters

2014

Yue Gu Jian-Hua Gu Xing-She Zhou

Remote procedure call (RPC) is a simple, transparent and useful paradigm for providing communication between two processes across a network. The compute unified device architecture (CUDA) programming toolkit and runtime enhance the programmability of the graphics processing unit (GPU) and make the GPU more versatile in high performance computing. Current research mainly focuses on the accelerat...

gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space

2016

Mochi Xue Kun Tian Yaozu Dong Jiacheng Ma Jiajun Wang Zhengwei Qi Bingsheng He Haibing Guan

With increasing GPU-intensive workloads deployed on the cloud, cloud service providers are seeking practical and efficient GPU virtualization solutions. However, cutting-edge GPU virtualization techniques such as gVirt still suffer from limited scalability, which constrains the number of guest virtual GPU instances. This paper introduces gScale, a scalable GPU virtualization ...

GPU-SAM: Leveraging multi-GPU split-and-merge execution for system-wide real-time support

Journal: Journal of Systems and Software, 2016

Wookhyun Han Hoon Sung Chwa Hwidong Bae Hyosu Kim Insik Shin

Multi-GPUs appear as an attractive platform to speed up data-parallel GPGPU computation. The idea of split-and-merge execution has been introduced to accelerate the parallelism of multiple GPUs even further. However, it has not been explored before how to exploit such an idea for real-time multi-GPU systems properly. This paper presents an open-source real-time multi-GPU scheduling framework, c...

Efficient Resource Sharing Through GPU Virtualization on Accelerated High Performance Computing Systems

Journal: CoRR, 2013

Teng Li Vikram K. Narayana Tarek A. El-Ghazawi

The High Performance Computing (HPC) field is witnessing a widespread adoption of Graphics Processing Units (GPUs) as co-processors for conventional homogeneous clusters. The adoption of the prevalent Single-Program Multiple-Data (SPMD) programming paradigm for GPU-based parallel processing brings in the challenge of resource underutilization, with the asymmetrical processor/co-processor distributio...

GPU-Based Volume Segmentation

2005

Stefan Schenke Burkhard C. Wünsche Joachim Denzler

Volume segmentation is an important part of any medical image analysis framework used for diagnoses, treatment planning and biomedical modelling and visualisation. Recent advances in modern graphics hardware have made it possible to perform general purpose computing on the GPU. In this paper we survey and analyse the current state-of-the-art of GPU-based volume segmentation algorithms. Limitati...

Operating Systems Challenges for GPU Resource Management

2011

Shinpei Kato Scott Brandt Yutaka Ishikawa

The graphics processing unit (GPU) is becoming a very powerful platform to accelerate graphics and data-parallel compute-intensive applications. It significantly outperforms traditional multi-core processors in performance and energy efficiency. Its application domains also range widely from embedded systems to high-performance computing systems. However, operating systems support is not adequa...

Parallel Acceleration of Deadlock Detection and Avoidance Algorithms on GPUs

2008

Stephen W. Abell John Jaehwan Lee

Abell, Stephen W. MSECE, Purdue University, August 2013. Parallel Acceleration of Deadlock Detection and Avoidance Algorithms on GPUs. Major Professor: Dr. John Jaehwan Lee. Current mainstream computing systems have become increasingly complex. Most have Central Processing Units (CPUs) that invoke multiple threads for their computing tasks. The growing issue with these systems is resou...

GPU-Vote: A Framework for Accelerating Voting Algorithms on GPU

2012

Gert-Jan van den Braak Cedric Nugteren Bart Mesman Henk Corporaal

Voting algorithms, such as histogram and Hough transforms, are frequently used in various domains, such as statistics and image processing. Algorithms in these domains may be accelerated using GPUs. Implementing voting algorithms efficiently on a GPU, however, is far from trivial due to irregularities and unpredictable memory accesses. Existing GPU implementations therefore target only...

Dynamic Kernel/Device Mapping Strategies for GPU-Assisted HPC Systems

2012

Jiadong Wu Weiming Shi Bo Hong

With their high computation throughput and outstanding performance-per-watt figures, graphics processing units (GPUs) are becoming increasingly important for high-performance computing (HPC) systems. The existing GPU execution environment restricts GPU usage to the local host node. This is suitable for standalone computer nodes, but becomes inefficient for HPC systems that consist of a large num...

General-purpose molecular dynamics simulations on GPU-based clusters

2010

Christian R. Trott Lars Winterfeld Paul S. Crozier

We present a GPU implementation of LAMMPS, a widely-used parallel molecular dynamics (MD) software package, and show 5x to 13x single node speedups versus the CPU-only version of LAMMPS. This new CUDA package for LAMMPS also enables multi-GPU simulation on hybrid heterogeneous clusters, using MPI for inter-node communication, CUDA kernels on the GPU for all methods working with particle data, a...
