CUDA platform

Search results for: CUDA platform

Number of results: 19735

Efficient Quicksort and 2D Convex Hull for CUDA, and MSIMD as a Realistic Model of Massively Parallel Computations

2011

Tomasz Jurkiewicz, Piotr Danilewski

In recent years CUDA has become a major architecture for multithreaded computations. Unfortunately, its potential is not yet being commonly utilized because many fundamental problems have no practical solutions for such machines. Our goal is to establish a hybrid multicore/parallel theoretical model that represents well architectures like NVIDIA CUDA, Intel Larrabee, and OpenCL as well as admits...

Automatic CUDA Code Synthesis Framework for Multicore CPU and GPU Architectures

2011

Hanwoong Jung, Youngmin Yi, Soonhoi Ha

Recently, general purpose GPU (GPGPU) programming has spread rapidly after CUDA was first introduced to write parallel programs in high-level languages for NVIDIA GPUs. While a GPU exploits data parallelism very effectively, task-level parallelism is exploited as a multi-threaded program on a multicore CPU. For such a heterogeneous platform that consists of a multicore CPU and GPU, in this pape...

Barra, a Parallel Functional GPGPU Simulator

2009

Sylvain Collange, David Defour, David Parello

We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...

CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations

2017

Ryosuke Okuta, Yuya Unno, Daisuke Nishino, Shohei Hido, Crissman Loomis

CuPy is an open-source library with NumPy syntax that increases speed by doing matrix operations on NVIDIA GPUs. It is accelerated with the CUDA platform from NVIDIA and also uses CUDA-related libraries, including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, and NCCL, to make full use of the GPU architecture. CuPy’s interface is highly compatible with NumPy; in most cases it can be used as a dr...

Implicit Incompressible SPH on the GPU

2015

Prashant Goswami, André Eliasson, Pontus Franzén

This paper presents CUDA-based parallelization of implicit incompressible SPH (IISPH) on the GPU. Along with the detailed exposition of our implementation, we analyze various components involved for their costs. We show that our CUDA version achieves near linear scaling with the number of particles and is faster than the multi-core parallelized IISPH on the CPU. We also present a basic comparis...

Connected Component Labeling in CUDA

2011

Ondřej Šťava

Connected component labeling (CCL) is a task of detecting connected regions in input data, and it finds its applications in pattern recognition, computer vision, and image processing. We present a new algorithm for connected component labeling in 2-D images implemented in CUDA. We first provide a brief overview of the CCL problem together with existing CPU-oriented algorithms. The rest of the c...

GPU-accelerated solutions to forward problem of TMS

Journal: Brain Stimulation, 2023

A forward model characterizes the relationship between a source and a measurement in computational form. In TMS, such models are used for planning and targeting stimulation. The contribution of the head volume conductor is typically solved using BEM or FEM. Here we consider, in the context of real-time TMS navigation, the performance benefits of C++ GPU (CUDA) code over MATLAB. MATLAB highly efficient matrix co...

CURD: a dynamic CUDA race detector

Accelerated MD Program Using CUDA Technology

Parallelized Seeded Region Growing Using CUDA