Multi-GPU Load Balancing for In-Situ Simulation and Visualization
Abstract
Multi-GPU systems have become widely available because they support massively parallel computing and provide more device memory for large-scale problems. Such systems are well suited to in-situ visualization applications, which require significant computational power to run simulation and visualization concurrently. While a pipelining-based parallel computing scheme overlaps the execution of simulation and rendering across multiple GPUs, workload imbalance can cause substantial performance loss in such a configuration. The aim of this paper is to investigate memory management and scheduling issues in the multi-GPU environment in order to balance the workload between the two stages of this pipeline. We first propose a data-driven load balancing scheme that takes into account important performance factors for scientific simulation and rendering, such as the number of simulation iterations and the rendering resolution. As an improvement to this scheduling method, we also introduce a dynamic load balancing approach that automatically adjusts to workload changes at runtime to achieve better load balancing. This approach analytically approximates the difference in execution time between the simulation and the rendering from the fullness of the synchronization data buffer. We evaluated our approaches on an eight-GPU system and observed a significant performance improvement.
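The abstract does not give the details of the dynamic scheme, so the following is only a minimal sketch of the general idea of using buffer fullness as a proxy for the speed difference between the two pipeline stages. The names `PipelineState` and `rebalance`, the thresholds `low` and `high`, and the one-GPU-at-a-time policy are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class PipelineState:
    """Hypothetical GPU split between the two pipeline stages."""
    sim_gpus: int     # GPUs currently assigned to simulation (producer)
    render_gpus: int  # GPUs currently assigned to rendering (consumer)


def rebalance(state, buffer_fill, capacity, low=0.25, high=0.75):
    """Return a new GPU split based on synchronization-buffer fullness.

    Assumption: a nearly full buffer means the simulation outpaces the
    renderer, so one GPU moves to rendering; a nearly empty buffer means
    the opposite. Each stage always keeps at least one GPU.
    """
    fullness = buffer_fill / capacity
    sim, ren = state.sim_gpus, state.render_gpus
    if fullness > high and sim > 1:
        sim, ren = sim - 1, ren + 1   # renderer is the bottleneck
    elif fullness < low and ren > 1:
        sim, ren = sim + 1, ren - 1   # simulation is the bottleneck
    return PipelineState(sim, ren)


if __name__ == "__main__":
    # Eight GPUs split evenly; the buffer holds 9 of 10 frames.
    print(rebalance(PipelineState(4, 4), buffer_fill=9, capacity=10))
```

In this sketch the buffer acts as the only feedback signal, so no per-stage timing measurements are needed. A real scheduler would also have to account for the cost of moving data when a GPU changes roles.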
Similar resources
Multi-GPU Load Balancing for In-situ Visualization
Real-time visualization is an important tool for immediately inspecting results for scientific simulations. Graphics Processing Units (GPUs) as commodity computing devices offer massive parallelism that can greatly improve performance for data-parallel applications. However, a single GPU provides limited support which is only suitable for smaller scale simulations. Multi-GPU computing, on the o...
Load-Balanced Multi-GPU Ambient Occlusion for Direct Volume Rendering
Ambient occlusion techniques were introduced to improve data comprehension by bringing soft fading shadows to the visualization of 3D datasets. They consist in attenuating light by considering the occlusion resulting from the presence of neighboring structures. Nevertheless they often come with an important precomputation cost, which prevents their use in interactive applications based on trans...
A flexible Patch-based lattice Boltzmann parallelization approach for heterogeneous GPU-CPU clusters
Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing...
A journey from single-GPU to optimized multi-GPU SPH with CUDA
We present an optimized multi-GPU version of GPUSPH, a CUDA implementation of fluid-dynamics models based on the Smoothed Particle Hydrodynamics (SPH) numerical method. SPH is a well-known Lagrangian model for the simulation of free-surface fluid flows; it exposes a high degree of parallelism and has already been successfully ported to GPU. We extend the GPU-based simulator to exploit multiple ...
Interactive High-Quality Visualization of Higher-Order Finite Elements
Higher-order finite element methods have emerged as an important discretization scheme for simulation. They are increasingly used in contemporary numerical solvers, generating a new class of data that must be analyzed by scientists and engineers. Currently available visualization tools for this type of data are either batch oriented or limited to certain cell types and polynomial degrees. Other...