Concurrent and Accurate Short Read Mapping on Multicore Platforms

Authors

  • Héctor Martínez
  • Joaquín Tárraga
  • Ignacio Medina
  • Sergio Barrachina
  • Maribel Castillo
  • Joaquín Dopazo
  • Enrique S. Quintana-Ortí
Abstract

In this paper we introduce a novel parallel workflow-based aligner for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, named HPG Aligner W, leverages the speed of the Burrows-Wheeler Transform to map a large number of RNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is complemented with a careful strategy to detect splice junctions: RNA reads are divided into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing useful information for the successful alignment of the complete reads. Experimental results on a platform with AMD multicore technology report the parallel performance of HPG Aligner W on RNA reads of 100–400 nucleotides: it outperforms state-of-the-art aligners such as TopHat 2+Bowtie 2 and MapSplice in terms of execution time and sensitivity, and compares favorably to STAR.
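
The three building blocks named in the abstract (the Burrows-Wheeler Transform, the division of reads into seeds, and Smith-Waterman local alignment) can be illustrated with a minimal Python sketch. This is not the HPG Aligner W implementation: the function names, the fixed-length non-overlapping seeding, and the scoring parameters are all assumptions chosen for illustration, and a real aligner would use an indexed BWT (FM-index) rather than sorted rotations.

```python
def bwt(text: str) -> str:
    """Naive Burrows-Wheeler Transform built from sorted rotations.

    Illustrative only: production aligners build the transform from a
    suffix array and query it through an FM-index.
    """
    s = text + "$"  # sentinel; '$' sorts before the letters A, C, G, T
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def split_into_seeds(read: str, seed_len: int) -> list[str]:
    """Cut a read into consecutive, non-overlapping seeds of seed_len.

    Assumption: a trailing fragment shorter than seed_len is dropped.
    """
    return [read[i:i + seed_len]
            for i in range(0, len(read) - seed_len + 1, seed_len)]


def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score of a against b (linear gap penalty).

    The scoring values are hypothetical defaults, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

In a workflow of the kind the abstract describes, the fast BWT-based search would handle most reads, and the quadratic-time Smith-Waterman step would be reserved for the reads or seeds that the fast path cannot place.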

Similar resources

Concurrent and Accurate RNA Sequencing on Multicore Platforms

In this paper we introduce a novel parallel pipeline for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, named HPG-aligner, leverages the speed of the Burrows-Wheeler Transform to map a large number of RNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, that is employed to deal with conflictive reads. Th...

Design Flow for GPU and Multicore Execution of Dynamic Dataflow Programs

Dataflow programming has received increasing attention in the age of multicore and heterogeneous computing. Modular and concurrent dataflow program descriptions enable highly automated approaches for design space exploration, optimization and deployment of applications. A great advance in dataflow programming has been the recent introduction of the RVC-CAL language. Having been standardized by ...

Viral population analysis and minority-variant detection using short read next-generation sequencing

RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unpreced...

MacroDB: Scaling Database Engines on Multicores

Multicore processors have been available for over a decade, but general purpose database management systems (DBMS) still cannot fully exploit the computational resources of these platforms. This paper explores a simple and easy to deploy approach for improving DBMS performance in multicore platforms, by maintaining multiple database engines running in parallel, rather than a single instance, thus cir...

Multicore vs Manycore: The Energy Cost of Concurrency

In this paper, we study the relation between performance and energy in concurrent programs. As energy efficiency became a key challenge of the computing industry, it is crucial to seek solutions that achieve high performance at a reasonable carbon footprint. We show, however, that energy is dramatically impacted by concurrency and it remains difficult to predict the energy consumed even when th...

Publication date: 2013