Parallelizing the Smith-Waterman Local Alignment Algorithm using CUDA
Author
Abstract
Given two strings S1 = pqaxabcstrqrtp and S2 = xyaxbacsl, the substrings axabcs in S1 and axbacs in S2 are very similar. The problem of finding such similar substrings is the local alignment problem. Local alignment is used extensively in computational biology to find regions of similarity in different biological sequences. Similar genetic sequences are identified by computing the local alignment of a given sequence against a number of other genetic sequences. Protein molecules fold into unique 3-dimensional shapes, and different regions fold into various shapes such as helices and sheets. These shapes determine the function of the proteins, and local alignment helps identify regions of structural similarity. BLAST and FASTA are two of the programs that compute the local alignment of a sequence against a database of other genetic sequences. Formally, given a scoring scheme that includes a cost for matching a pair of characters and a cost for inserting a character in one sequence (equivalently, introducing a gap in the other sequence), a local alignment of strings S1 and S2 is a pair of substrings s1 of S1 and s2 of S2 whose score is maximum over all possible pairs of substrings of S1 and S2 under that scoring scheme. Unlike the global alignment problem, where the entire strings are to be matched, and unlike the edit distance problem, where the goal is to minimize the cost of transforming one sequence into another, the local alignment problem identifies highly similar substrings.
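The formal definition above can be illustrated with a minimal sequential sketch of the Smith-Waterman dynamic program. It is not the paper's CUDA implementation: the function name `smith_waterman`, the linear gap penalty, and the scoring values (match = 2, mismatch = -1, gap = -1) are assumptions chosen only for illustration.

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=-1):
    """Return (best_score, substring_of_s1, substring_of_s2).

    Illustrative scoring scheme (an assumption, not from the paper):
    linear gap penalty with fixed match/mismatch scores.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    # H[i][j] = best score of a local alignment ending at s1[i-1], s2[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # gap in s2
                H[i][j - 1] + gap,      # gap in s1
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if s1[i - 1] == s2[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, s1[i:end_i], s2[j:end_j]


if __name__ == "__main__":
    # The two example strings from the abstract.
    print(smith_waterman("pqaxabcstrqrtp", "xyaxbacsl"))
```

Each cell depends only on its upper, left, and upper-left neighbors, so the cells on one anti-diagonal are independent of each other. That property is what GPU implementations commonly exploit, although the abstract does not say which strategy this paper uses.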
Similar resources
Fast Sequence Alignment Method Using CUDA-enabled GPU
Sequence alignment is a task that calculates the degree of similarity between two sequences. Given a query sequence, finding a database sequence which is most similar to the query by sequence alignment is the first step in bioinformatics research. The first sequence alignment algorithm was proposed by Needleman and Wunsch. They got the optimal global alignment by using dynamic programming metho...
SW#–GPU-enabled exact alignments on genome scale
SUMMARY We propose SW#, a new CUDA graphical processor unit-enabled and memory-efficient implementation of dynamic programming algorithm, for local alignment. It can be used as either a stand-alone application or a library. Although there are other graphical processor unit implementations of the Smith-Waterman algorithm, SW# is the only one publicly available that can produce sequence alignment...
GPU-Based Cloud Service for Smith-Waterman Algorithm Using Frequency Distance Filtration Scheme
As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware as graphics processing units...
GPU-SW Sequence Alignment server
We present a complete sequence homology search server based on the hybrid CPU/GPU implementation of the Smith-Waterman algorithm for sequence alignment. We discuss system architecture, division of the tasks between CPU and GPU in the hybrid design, the scalability issues and hardware requirements. The performance of the server is compared with the state-of-the-art sequence analysis servers. Bioi...
Parallel Smith-Waterman Algorithm for Gene Sequencing
The Smith-Waterman algorithm represents a highly robust and efficient basis for parallel computing system development for biological gene sequences. The research work here gives a deep understanding of existing approaches to gene sequencing and alignment using Smith-Waterman, along with their strengths and weaknesses. The Smith-Waterman algorithm calculates the local alignment of two given sequences us...