Search results for: CUDA platform

Number of results: 19735

Journal: Int. J. Computational Intelligence Systems 2014
R. T. Kneusel

Curve fitting is a fundamental task in many research fields. In this paper we present results demonstrating the fitting of 2D images using CUDA (compute unified device architecture) on NVIDIA graphics processors via particle swarm optimization (PSO). Particle swarm optimization is particularly well-suited to implementation on graphics processors using CUDA as each CUDA thread can be made to mod...

Journal: J. Parallel Distrib. Comput. 2012
Jiri Barnat Petr Bauch Lubos Brim Milan Ceska

Recent technological developments have made various many-core hardware platforms widely accessible. These massively parallel architectures have been used to significantly accelerate many computationally demanding tasks. In this paper we show how the algorithms for LTL model checking can be redesigned in order to accelerate LTL model checking on many-core GPU platforms. Our detailed experimental evaluati...

Journal: CoRR 2015
Hongyu Meng Fangjin Guo

With the development of computing technology, CUDA has become a very important tool. In computer programming, sorting algorithms are widely used. There are many simple sorting algorithms, such as enumeration sort, bubble sort and merge sort. In this paper, we test some simple sorting algorithms based on CUDA and draw some useful conclusions.

2013
Martin Köhler

GPU Computing: CUDA vs. OpenCL. Currently there is no clear standard for the programming model in applications involving graphics processing units (GPUs). Nvidia, as one of the most important hardware manufacturers, is pushing its C language extension CUDA, while its competitor AMD/ATI is following the general OpenCL framework, which in principle can be applied to arbitrary accelerator...

Journal: CoRR 2014
Ke Ding Ying Tan

Benchmarking is key for developing and comparing optimization algorithms. In this paper, a CUDA-based real parameter optimization benchmark (cuROB) is introduced. Test functions of diverse properties are included within cuROB and implemented efficiently with CUDA. Speedup of one order of magnitude can be achieved in comparison with CPU-based benchmark of CEC’14.

Journal: CoRR 2013
Pushan Majumdar

OpenACC compilers allow one to use Graphics Processing Units without having to write explicit CUDA code. Programs can be modified incrementally using OpenMP-like directives, which cause the compiler to generate CUDA kernels to be run on the GPUs. In this article we look at the performance gain in lattice simulations with dynamical fermions using OpenACC compilers.

2014
Florentino Sainz Sergi Mateo Vicenç Beltran Jose L. Bosque Eduard Ayguadé

CUDA and OpenCL are the most widely used programming models to exploit hardware accelerators. Both programming models provide a C-based programming language to write accelerator kernels and a host API used to glue the host and kernel parts. Although this model is a clear improvement over a low-level and ad-hoc programming model for each hardware accelerator, it is still too complex and cumberso...

Journal: Concurrency and Computation: Practice and Experience 2014
Daisuke Takafuji Koji Nakano Yasuaki Ito

C2CU: A CUDA C Program Generator for Bulk Execution. We present a time-optimal implementation for bulk execution of an oblivious sequential algorithm. Our second contribution is to develop a tool, named C2CU, which automatically generates a CUDA C program for a bulk execution of an oblivious sequential algorithm.

Journal: CoRR 2010
Ben D. Lund Justin W. Smith

We present a new implementation of the Floyd-Warshall All-Pairs Shortest Paths algorithm on CUDA. Our algorithm runs approximately 5 times faster than the previously best reported algorithm. In order to achieve this speedup, we applied a new technique to reduce usage of on-chip shared memory and allow the CUDA scheduler to more effectively hide instruction latency.
