Search results for: CUDA platform

Number of results: 19735

2012
Alexandru Pîrjan

In this paper, I have researched and developed solutions for optimizing the stream compaction algorithmic function using the Compute Unified Device Architecture (CUDA). Stream compaction is a common parallel primitive and an essential building block for many data processing algorithms, whose optimization improves the performance of a wide class of parallel algorithms useful in data processing....
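A minimal sketch of the primitive itself (not of the paper's optimizations), assuming Thrust, which ships with the CUDA toolkit: copy_if internally carries out the scan-and-scatter steps that define stream compaction.

    // Stream-compaction sketch: keep the non-zero elements of an input stream.
    #include <thrust/device_vector.h>
    #include <thrust/copy.h>
    #include <cstdio>

    struct is_nonzero {
        __host__ __device__ bool operator()(int x) const { return x != 0; }
    };

    int main() {
        int h_in[8] = {0, 3, 0, 7, 0, 0, 5, 1};
        thrust::device_vector<int> d_in(h_in, h_in + 8);
        thrust::device_vector<int> d_out(8);

        // copy_if runs the scan + scatter phases of compaction on the device.
        auto end = thrust::copy_if(d_in.begin(), d_in.end(), d_out.begin(), is_nonzero());

        for (auto it = d_out.begin(); it != end; ++it)
            printf("%d ", (int)*it);
        printf("\n");   // expected output: 3 7 5 1
        return 0;
    }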

2015
Yu Liu Yang Hong Chun-Yuan Lin Che-Lun Hung

The Smith-Waterman (SW) algorithm has been widely utilized for searching biological sequence databases in bioinformatics. Recently, several works have adopted graphics cards with Graphics Processing Units (GPUs) and the associated CUDA model to enhance the performance of SW computations. However, these works mainly focused on protein database search using intertask parallelization...
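The abstract is truncated, so only the general intertask scheme is visible; a heavily simplified sketch of that scheme follows, assuming one thread per database sequence, a linear gap penalty, a simple match/mismatch score, and a flat char-array layout with per-sequence offsets (all illustrative assumptions, not the authors' implementation).

    // Intertask Smith-Waterman sketch: each thread scores one subject sequence
    // against a shared query and reports the best local-alignment score.
    #include <cuda_runtime.h>

    #define MAX_QUERY 128          // assumed upper bound on query length
    #define MATCH      2
    #define MISMATCH  -1
    #define GAP        1

    __global__ void sw_score_kernel(const char *query, int qlen,
                                    const char *db, const int *offsets,
                                    const int *lengths, int nseq, int *scores)
    {
        int s = blockIdx.x * blockDim.x + threadIdx.x;
        if (s >= nseq) return;

        const char *subject = db + offsets[s];
        int slen = lengths[s];

        int prev[MAX_QUERY + 1];   // DP row for subject position i-1
        int curr[MAX_QUERY + 1];   // DP row for subject position i
        for (int j = 0; j <= qlen; ++j) prev[j] = 0;

        int best = 0;
        for (int i = 1; i <= slen; ++i) {
            curr[0] = 0;
            for (int j = 1; j <= qlen; ++j) {
                int sub = (subject[i - 1] == query[j - 1]) ? MATCH : MISMATCH;
                int h = prev[j - 1] + sub;          // diagonal move
                h = max(h, prev[j] - GAP);          // gap in query
                h = max(h, curr[j - 1] - GAP);      // gap in subject
                h = max(h, 0);                      // local-alignment floor
                curr[j] = h;
                best = max(best, h);
            }
            for (int j = 0; j <= qlen; ++j) prev[j] = curr[j];
        }
        scores[s] = best;
    }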

2014
Vaibhav Tuteja

In this paper we discuss image encryption and decryption using the RSA algorithm, which was earlier used for text encryption. Today it is a crucial concern that proper encryption and decryption be applied so that unauthorized access can be prevented. We intend to build a general RSA algorithm which can be combined with other image processing techniques to provide new methodologies and be...
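As a rough, hedged sketch of what per-pixel RSA on the GPU can look like (textbook RSA without padding, with toy key sizes small enough that intermediate products fit into 64-bit arithmetic; none of this is taken from the paper):

    // One thread per 8-bit pixel; c = m^key mod n via square-and-multiply.
    #include <cuda_runtime.h>

    __device__ unsigned long long modpow(unsigned long long base,
                                         unsigned long long exp,
                                         unsigned long long mod)
    {
        unsigned long long result = 1 % mod;
        base %= mod;
        while (exp > 0) {
            if (exp & 1ULL) result = (result * base) % mod;   // multiply step
            base = (base * base) % mod;                       // square step
            exp >>= 1;
        }
        return result;
    }

    __global__ void rsa_pixels(const unsigned char *in, unsigned int *out,
                               int npixels, unsigned long long key,
                               unsigned long long n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < npixels)
            out[i] = (unsigned int)modpow(in[i], key, n);
    }

    // Encryption passes key = e, decryption key = d; the ciphertext needs more
    // than 8 bits per pixel, hence the wider output buffer.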

2013
Muhammed Al-Mulhem Abdulah AlDhamin Raed Al-Shaikh

Parallel programming languages represent a common theme in the evolution of high performance computing (HPC) systems. There are several parallel programming languages that are directly associated with different HPC systems. In this paper, we compare the performance of three commonly used parallel programming languages, namely OpenMP, MPI, and CUDA. Our performance evaluation of these languages ...
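For context, the CUDA member of such a comparison is usually a small kernel like the one below (the OpenMP and MPI counterparts would be a parallel-for loop and a scatter/compute/gather pattern); this is a generic sketch, not the benchmark used in the paper.

    // Element-wise vector addition, the classic micro-benchmark kernel.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void vadd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a, *b, *c;
        cudaMallocManaged(&a, bytes);   // unified memory keeps the sketch short
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        vadd<<<(n + 255) / 256, 256>>>(a, b, c, n);
        cudaDeviceSynchronize();

        printf("c[0] = %f\n", c[0]);    // expected 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }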

2012
Ahmad Abdelfattah Jack J. Dongarra David E. Keyes Hatem Ltaief

Hardware accelerators are becoming ubiquitous in high performance scientific computing. They are capable of delivering an unprecedented level of concurrent execution contexts. High-level programming language extensions (e.g., CUDA) and profiling tools (e.g., PAPI-CUDA, CUDA Profiler) are paramount to improving productivity while effectively exploiting the underlying hardware. We present an optimized n...
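The profiling tools named in the abstract (PAPI-CUDA, the CUDA Profiler) expose hardware counters; the lowest-friction in-code alternative is CUDA event timing, sketched below on a placeholder kernel (the kernel and sizes are illustrative, not from the paper).

    // Timing a kernel with CUDA events; cudaEventElapsedTime reports milliseconds.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void scale_kernel(float *x, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] = x[i] * 2.0f + 1.0f;
    }

    int main() {
        const int n = 1 << 22;
        float *x;
        cudaMalloc(&x, n * sizeof(float));
        cudaMemset(x, 0, n * sizeof(float));

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        scale_kernel<<<(n + 255) / 256, 256>>>(x, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("kernel time: %.3f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(x);
        return 0;
    }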

2014
Lauro Cássio Martins de Paula Anderson da Silva Soares

This paper presents a parallel implementation of the Hybrid Bi-Conjugate Gradient Stabilized (BiCGStab(2)) iterative method on a Graphics Processing Unit (GPU) for the solution of large, sparse linear systems. This implementation uses CUDA-Matlab integration, in which the method's operations are performed on GPU cores using Matlab built-in functions. The goal is to show that the exploitation...
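The dominant cost in BiCGStab-type solvers is the sparse matrix-vector product; the scalar CSR kernel below (one thread per row) sketches that underlying operation. The paper itself drives the GPU through CUDA-Matlab integration rather than hand-written kernels, so this is only illustrative.

    // Scalar CSR sparse matrix-vector product: y = A * x.
    #include <cuda_runtime.h>

    __global__ void csr_spmv(int nrows, const int *row_ptr, const int *col_idx,
                             const double *vals, const double *x, double *y)
    {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= nrows) return;

        double sum = 0.0;
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k)
            sum += vals[k] * x[col_idx[k]];   // accumulate the row's non-zeros
        y[row] = sum;
    }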

2013
Xinbiao Gan Cong liu Zhiying Wang Li Shen Qi Zhu Jie Liu Lihua Chi Yihui Yan Bin Yu

Protein secondary structure prediction is very important for understanding a protein's molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from a protein sequence. However, the running time becomes unbearable with the sharp growth of protein databases. Fortunately, CUDA (Compute Unified Device Architecture) p...
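The per-residue, fixed-window scoring that GOR performs maps naturally onto one CUDA thread per residue; a hedged sketch follows, assuming three states (helix/sheet/coil), the standard 17-residue window, and a flat precomputed information-value table (the table layout is an assumption for illustration, not the authors' code).

    // One thread per residue: sum window scores per state, keep the argmax.
    #include <cuda_runtime.h>

    #define WIN     17             // standard GOR window size
    #define HALF     8
    #define NSTATES  3             // H, E, C
    #define NAA     20

    __global__ void gor_predict(const unsigned char *seq, int len,
                                const float *info,    // [NSTATES][WIN][NAA], flattened
                                unsigned char *pred)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= len) return;

        float best = -1e30f;
        int best_s = 0;
        for (int s = 0; s < NSTATES; ++s) {
            float score = 0.0f;
            for (int w = -HALF; w <= HALF; ++w) {
                int p = i + w;
                if (p < 0 || p >= len) continue;       // window falls off the ends
                int aa = seq[p];                        // residue index 0..19
                score += info[(s * WIN + (w + HALF)) * NAA + aa];
            }
            if (score > best) { best = score; best_s = s; }
        }
        pred[i] = (unsigned char)best_s;
    }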

Journal: Computers & Mathematics with Applications 2011
Christian Obrecht Frédéric Kuznik Bernard Tourancheau Jean-Jacques Roux

Emerging many-core processors, like CUDA-capable nVidia GPUs, are promising platforms for regular parallel algorithms such as the Lattice Boltzmann Method (LBM). Since global memory on graphics devices shows high latency and LBM is data intensive, the memory access pattern is an important issue for achieving good performance. Whenever possible, global memory loads and stores should be coalesced and a...
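The coalescing concern the abstract raises is largely a data-layout question; the sketch below assumes a structure-of-arrays layout for the distribution functions, f[q * ncells + cell], so that consecutive threads (consecutive cells) touch consecutive addresses. D2Q9 and the BGK collision form are illustrative assumptions, not the authors' kernels.

    // BGK collision step over an SoA-stored distribution array.
    #include <cuda_runtime.h>

    #define Q 9   // D2Q9 lattice

    __global__ void lbm_collide_bgk(float *f, const float *feq, int ncells, float omega)
    {
        int cell = blockIdx.x * blockDim.x + threadIdx.x;
        if (cell >= ncells) return;

        // All values for a given direction q are contiguous, so the access
        // f[q * ncells + cell] is coalesced across a warp of consecutive cells.
        for (int q = 0; q < Q; ++q) {
            int idx = q * ncells + cell;
            f[idx] = f[idx] - omega * (f[idx] - feq[idx]);
        }
    }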

Journal: CoRR 2013
Bogdan Oancea Tudorel Andrei Raluca Mariana Dragoescu

Parallel computing can offer an enormous performance advantage for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core processors that can achieve very high FLOP rates. Since the first idea of using GPUs for general purpose computing, things have evolved, and now there are sever...
