Search results for: CUDA platform

Number of results: 19735

2011
Tomasz Jurkiewicz Piotr Danilewski

In recent years CUDA has become a major architecture for multithreaded computations. Unfortunately, its potential is not yet being commonly utilized because many fundamental problems have no practical solutions for such machines. Our goal is to establish a hybrid multicore/parallel theoretical model that represents well architectures like NVIDIA CUDA, Intel Larrabee, and OpenCL as well as admits...

2011
Hanwoong Jung Youngmin Yi Soonhoi Ha

Recently, general purpose GPU (GPGPU) programming has spread rapidly after CUDA was first introduced to write parallel programs in high-level languages for NVIDIA GPUs. While a GPU exploits data parallelism very effectively, task-level parallelism is exploited as a multi-threaded program on a multicore CPU. For such a heterogeneous platform that consists of a multicore CPU and GPU, in this pape...

2009
Sylvain Collange David Defour David Parello

We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...

2017
Ryosuke Okuta Yuya Unno Daisuke Nishino Shohei Hido Crissman

CuPy is an open-source library with NumPy syntax that increases speed by doing matrix operations on NVIDIA GPUs. It is accelerated with the CUDA platform from NVIDIA and also uses CUDA-related libraries, including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, and NCCL, to make full use of the GPU architecture. CuPy’s interface is highly compatible with NumPy; in most cases it can be used as a dr...

2015
Prashant Goswami André Eliasson Pontus Franzén

This paper presents CUDA-based parallelization of implicit incompressible SPH (IISPH) on the GPU. Along with the detailed exposition of our implementation, we analyze various components involved for their costs. We show that our CUDA version achieves near linear scaling with the number of particles and is faster than the multi-core parallelized IISPH on the CPU. We also present a basic comparis...

2011
Ondřej Šťava

Connected component labeling (CCL) is a task of detecting connected regions in input data, and it finds its applications in pattern recognition, computer vision, and image processing. We present a new algorithm for connected component labeling in 2-D images implemented in CUDA. We first provide a brief overview of the CCL problem together with existing CPU-oriented algorithms. The rest of the c...
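
For orientation, the CCL task described in the snippet above can be sketched with the classic two-pass, union-find labeling commonly used on CPUs. This is not the CUDA algorithm from the listed paper, whose details are cut off here. It is a minimal illustration that assumes a binary image stored as a list of lists and 4-connectivity, and the name `label_components` is hypothetical.

```python
def label_components(grid):
    """Two-pass 4-connected labeling of a 2-D binary grid (lists of 0/1).

    Illustrative CPU sketch only; not the CUDA algorithm of the listed paper.
    Returns (labels, count), where labels uses 0 for background.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    parent = [0]  # union-find forest; index 0 is reserved for background

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge two provisional labels, keeping the smaller root.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    labels = [[0] * cols for _ in range(rows)]

    # Pass 1: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                labels[r][c] = min(up, left)
                union(up, left)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1

    # Pass 2: replace each label by its root, renumbered consecutively.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

The first pass depends on scanning pixels in raster order, which is one reason this CPU-oriented formulation does not map directly onto GPU threads.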

Journal: Brain Stimulation, 2023

A forward model characterizes the relationship between source and a measurement in computational form. In TMS, such models are used for planning targeting stimulation. The contribution of head volume conductor to is typically solved using BEM or FEM. Here we consider context real-time TMS navigation performance benefits C++ GPU (CUDA) code over MATLAB. MATLAB highly efficient matrix co...

Journal: ACM SIGPLAN Notices, 2018

Journal: Communications in Physics, 2011

Journal: Computational and Mathematical Methods in Medicine, 2014
