Search results for: بستر cuda ("CUDA platform")
Number of results: 19735
Recent technological developments made available various many-core hardware platforms. For example, a SIMD-like hardware architecture became easily accessible for many users who have their computers equipped with modern NVIDIA GPU cards with CUDA technology. In this paper we redesign the maximal accepting predecessors algorithm [7] for LTL model checking in terms of matrix-vector product in ord...
Given the popularity and ever-increasing use of digital devices in everyday life, the spread of image sharing on social networks such as Facebook, Flickr, Instagram, and others, and the uploading of various videos to these networks, the use of digital images has grown considerably, especially in the last decade. A high percentage of these images are images of human faces, and in cases such as online image monitoring...
In this note, we present a stability and performance analysis of an asynchronous parallel computing algorithm applied to the 1D heat equation with CUDA. The primary objective of this note is to disseminate the asynchronous parallel computing algorithm by providing CUDA code for fast and easy implementation. We show that the simulations carried out on an NVIDIA GPU device with asynchronous...
Parallel prefix scan, also known as parallel prefix sum, is a building block for many parallel algorithms, including polynomial evaluation, sorting, and building data structures. This paper introduces prefix scan and describes a step-by-step procedure to implement prefix scan efficiently with Compute Unified Device Architecture (CUDA). This paper starts with a basic naive algorithm and procee...
The ideal choice for the tasks presented in the project proposal would be the NVIDIA CUDA toolkit (http://developer.nvidia.com/object/cuda.html), which exposes more of the underlying architecture to programmers. However, the package requires a capable NVIDIA video card, and we could not get one for this project. ATI also designed a similar platform, “Close-to-Metal (CTM) Device” (http://ati.de/companyinf...
ions Skeletons and Composition: Tomorrow 4:30pm, OpenGPU workshop. DSL: embedded language to express kernels. Real-world use case: 2DRMP, Dimensional R-matrix propagation (Computer Physics Communications). Simulates electron scattering from H-like atoms and ions at intermediate energies. Multi-architecture: multicore, GPGPU, clusters, GPU clusters. Translate from Fortran + Cuda to OCaml+SPOC + Cuda/Op...
High-performance streams of (pseudo) random numbers are crucial for the efficient implementation of countless stochastic algorithms, most importantly Monte Carlo simulations and molecular dynamics simulations with stochastic thermostats. A number of implementations of random number generators have been discussed for GPU platforms before, and some generators are even included in the CUDA support...
This paper presents a parallel belief propagation algorithm, based on Particle Filtering (PF), for channel estimation and Low-Density Parity-Check (LDPC) decoding on Compute Unified Device Architecture (CUDA). The authors have found that, compared with the traditional Belief Propagation (BP) algorithm with fixed estimated noise power, the BP algorithm based on PF [1] not only gives a good...
An MPI-friendly density functional theory (DFT) source code was modified within hybrid parallelization including CUDA. The objective is to find out how simple conversions within the hybrid parallelization with mid-range GPUs affect a DFT code not originally suited to CUDA. Several rules of hybrid parallelization for numerical-atomic-orbital (NAO) DFT codes were established. The test was performed on...