A Distributed Multi-GPU System for Fast Graph Processing
Authors: Zhihao Jia, Yongkee Kwon, Galen Shipman, Pat McCormick, Mattan Erez, Alex Aiken
Abstract
We present Lux, a distributed multi-GPU system that achieves fast graph processing by exploiting the aggregate memory bandwidth of multiple GPUs and taking advantage of locality in the memory hierarchy of multi-GPU clusters. Lux provides two execution models that optimize algorithmic efficiency and enable important GPU optimizations, respectively. Lux also uses a novel dynamic load balancing strategy that is cheap and achieves good load balance across GPUs. In addition, we present a performance model that quantitatively predicts the execution times and automatically selects the runtime configurations for Lux applications. Experiments show that Lux achieves up to 20× speedup over state-of-the-art shared memory systems and up to two orders of magnitude speedup over distributed systems.
PVLDB Reference Format: Zhihao Jia, Yongkee Kwon, Galen Shipman, Pat McCormick, Mattan Erez, and Alex Aiken. A Distributed Multi-GPU System for Fast Graph Processing. PVLDB, 11(3): xxxx-yyyy, 2017. DOI: 10.14778/3157794.3157799
Similar resources
Ultra-Fast Image Reconstruction of Tomosynthesis Mammography Using GPU
Digital Breast Tomosynthesis (DBT) is a technology that creates three-dimensional (3D) images of breast tissue. Tomosynthesis mammography detects lesions that are not detectable with other imaging systems. If image reconstruction time is on the order of seconds, tomosynthesis systems can be used to perform tomosynthesis-guided interventional procedures. This research has been designed to study u...
Towards Efficient Graph Traversal using a Multi-GPU Cluster
Graph processing has always been a challenge, as there are inherent complexities in it. These include scalability to larger data sets and clusters, dependencies between vertices in the graph, irregular memory accesses during processing and traversals, minimal locality of reference, etc. In the literature, there are several implementations for parallel graph processing on single GPU systems but only...
An Approach in Radiation Therapy Treatment Planning: A Fast, GPU-Based Monte Carlo Method
Introduction: An accurate and fast radiation dose calculation is essential for successful radiotherapy. The aim of this study was to implement a new graphics processing unit (GPU) based radiation therapy treatment planning for accurate and fast dose calculation in radiotherapy centers. Materials and Methods: A program was written for parallel runnin...
Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal
Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks because noise arises during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...
GLUG: GPU Layout of Undirected Graphs
We present a fast parallel algorithm for layout of undirected graphs, using commodity graphics processing unit (GPU) hardware. The GLUG algorithm creates a force-based layout minimizing the Kamada Kawai energy of the graph embedding. Two parameters control the graph layout: the number of landmarks used in the force simulation determines the influence of the global structure, while the number of...
Journal title:
- PVLDB
Volume 11, Issue 3
Pages -
Publication date 2017