Search results for: CUDA platform

Number of results: 19735

Journal: Simulation Modelling Practice and Theory 2012
Pablo R. Rinaldi E. A. Dari Marcelo J. Vénere Alejandro Clausse

A three-dimensional Lattice-Boltzmann fluid model with nineteen discrete velocities was implemented using the NVIDIA Graphics Processing Unit (GPU) programming language "Compute Unified Device Architecture" (CUDA). Previous LBM GPU implementations required two steps to maximize memory bandwidth due to memory access restrictions of earlier versions of the CUDA toolkit and hardware capabilities. In this ...
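A three-dimensional lattice with nineteen discrete velocities is usually the standard D3Q19 set, and the following is a minimal sketch of that set under that assumption. The abstract does not list the velocities or weights, so the function name `d3q19` and the textbook weights (1/3, 1/18, 1/36) are illustrative choices here, not details taken from the paper.

```python
from fractions import Fraction
from itertools import product


def d3q19():
    """Build the standard D3Q19 velocity set and its lattice weights.

    Assumption: the textbook D3Q19 lattice (1 rest vector, 6 axis-aligned
    vectors, 12 face-diagonal vectors). The paper's own ordering and
    storage layout are not given in the abstract.
    """
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        speed2 = sum(x * x for x in c)
        if speed2 == 0:
            w = Fraction(1, 3)    # rest particle
        elif speed2 == 1:
            w = Fraction(1, 18)   # axis-aligned neighbours
        elif speed2 == 2:
            w = Fraction(1, 36)   # face-diagonal neighbours
        else:
            continue              # cube corners (speed^2 == 3) are not in D3Q19
        velocities.append(c)
        weights.append(w)
    return velocities, weights


velocities, weights = d3q19()
```

Exact fractions are used so that the lattice symmetry conditions (weights summing to one, zero first moment, isotropic second moment) can be checked without rounding error.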

2012
Shi Jing Qi Huang Jianbo Yi

In this paper, the design and implementation of a new parallel computing method based on the CUDA (Compute Unified Device Architecture) platform are described in detail. The method includes matrix fraction and partial LU decomposition algorithms that are used to support parallel computing for simulation of the whole scene test in power system. The paper describes all the steps of algorithm implementat...

2008
John A. Stratton Sam S. Stone Wen-mei W. Hwu

CUDA is a data parallel programming model that supports several key abstractions (thread blocks, hierarchical memory and barrier synchronization) for writing applications. This model has proven effective in programming GPUs. In this paper we describe a framework called MCUDA, which allows CUDA programs to be executed efficiently on shared memory, multi-core CPUs. Our framework consists ...
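The three abstractions named in this abstract can be illustrated with a small CPU-side emulation. This is only a sketch under stated assumptions, not the MCUDA implementation (the abstract is truncated before the framework is described): the hypothetical helpers `run_block` and `launch` serialize the logical threads of a block in an ordinary loop, and each pass of the outer loop stands in for the code between two barriers.

```python
def run_block(block_dim, data):
    """Emulate one thread block performing a tree reduction.

    Assumptions: block_dim is a power of two and len(data) >= block_dim.
    `shared` stands in for per-block shared memory.
    """
    shared = list(data[:block_dim])
    stride = block_dim // 2
    while stride > 0:
        # One loop over all logical threads per barrier-delimited region:
        # every thread finishes this region before any thread starts the
        # next one, which is the guarantee a block-level barrier provides.
        for tid in range(block_dim):
            if tid < stride:
                shared[tid] += shared[tid + stride]
        stride //= 2
    return shared[0]


def launch(grid_dim, block_dim, data):
    """Run grid_dim independent blocks, each on its own slice of data."""
    return [
        run_block(block_dim, data[b * block_dim:(b + 1) * block_dim])
        for b in range(grid_dim)
    ]


block_sums = launch(2, 4, list(range(8)))
```

Serializing the threads is safe in this example because, within one region, threads only write indices below `stride` and only read indices at or above it, so the order in which the logical threads run does not change the result.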

Journal: CoRR 2014
Jae-Hyeon Parq Erik Sevre Sang-Mook Lee

We modified an MPI-friendly density functional theory (DFT) source code within hybrid parallelization including CUDA. Our objective is to find out how simple conversions within the hybrid parallelization with mid-range GPUs affect DFT code not originally suitable to CUDA. We settled several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes. The test was performed on a ...

2009
Sara S. Baghsorkhi Matthieu Delahaye William D. Gropp Wen-mei W. Hwu

In this paper we present an analytical model to predict the performance of general purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it narrow the search to the more promising implementations. This work is based on the NVIDIA GPUs using CUDA (Compute Unified Device Architecture). We analyze each CUDA kernel an...

2015
Jörn Teuber Rene Weller Gabriel Zachmann

We present a novel framework for the simultaneous development for different massively parallel platforms. Currently, our framework supports CUDA and OpenCL but it can be easily adapted to other programming languages. The main idea is to provide an easy-to-use abstraction layer that encapsulates the calls of own parallel device code as well as library functions. With our framework the code has t...

2017
Gábor Dániel Balogh I. Z. Reguly Gihan R. Mudalige

Efficiently exploiting GPUs is increasingly essential in scientific computing, as many current and upcoming supercomputers are built using them. To facilitate this, there are a number of programming approaches, such as CUDA, OpenACC and OpenMP 4, supporting different programming languages (mainly C/C++ and Fortran). There are also several compiler suites (clang, nvcc, PGI, XL) each supporting d...

2009
José M. Cecilia José M. García Manuel Ujaldon

Modern graphics processing units (GPUs) have been at the leading edge of increasing chip-level parallelism over the last ten years, and the CUDA programming model has recently allowed us to exploit its power across many computational domains. Within them, dense linear algebra algorithms emerge as a natural fit for CUDA and the GPU because they are usually inherently parallel and can naturally...

2009
Matthew R. Norman

Computational fluid dynamics in general requires large computational resources. The same is true for an atmospheric model which simulates non-hydrostatic density-stratified flow with a gravity source term. There have been many applications of CUDA to CFD problems as can be seen by the many papers on . In fact, a full-scale global atmospheric model has been parallelized for CUDA. For my graduate ...

2008
S. Ponce J. Huang S. I. Park C. Khoury Y. Cao F. Quek W. Feng

This paper presents a novel parallelization and quantitative characterization of various optimization strategies for data-parallel computation on a graphics processing unit (GPU) using NVIDIA's new GPU programming framework, Compute Unified Device Architecture (CUDA). CUDA is an easy-to-use development framework that has drawn the attention of many different application areas looking for dramati...
