The Oyster River Protocol: A Multi Assembler and Kmer Approach For de novo Transcriptome Assembly

Author

  • Matthew D. MacManes
Abstract

Characterizing transcriptomes in non-model organisms has resulted in a massive increase in our understanding of biological phenomena. This boon, largely made possible via high-throughput sequencing, means that studies of functional, evolutionary and population genomics are now being done by hundreds or even thousands of labs around the world. For many, these studies begin with a de novo transcriptome assembly, which is a technically complicated process involving several discrete steps. The Oyster River Protocol (ORP), described here, implements a standardized and benchmarked set of bioinformatic processes, resulting in an assembly with enhanced qualities over other standard assembly methods. Specifically, ORP produced assemblies have higher TransRate scores and mapping rates, which is largely a product of the fact that it leverages a multi-assembler and kmer assembly process, thereby bypassing the shortcomings of any one approach. These improvements are important, as previously unassembled transcripts are included in ORP assemblies, resulting in a significant enhancement of the power of downstream analysis. Further, as part of this study, we show that assembly quality is unrelated to taxonomy, nor is it related to the number of reads generated, above 30 million reads.

Code Availability: The version controlled open-source code is available at https://github.com/macmanes-lab/Oyster_River_Protocol. Instructions for software installation and use, and other details are available at http://oyster-river-protocol.rtfd.org/.

Competing Interests: The author declares no competing interests.

Similar resources

Informed kmer selection for de novo transcriptome assembly

MOTIVATION: De novo transcriptome assembly is an integral part of many RNA-seq workflows. Common applications include sequencing of non-model organisms, cancer or meta transcriptomes. Most de novo transcriptome assemblers use the de Bruijn graph (DBG) as the underlying data structure. The quality of the assemblies produced by such assemblers is highly influenced by the exact word length k. As su...

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Establishing evidenced-based best practice for the de novo assembly and evaluation of transcriptomes from non-model organisms

Characterizing transcriptomes in both model and non-model organisms has resulted in a massive increase in our understanding of biological phenomena. This boon, largely made possible via high-throughput sequencing, means that studies of functional, evolutionary and population genomics are now being done by hundreds or even thousands of labs around the world. For many, these studies begin w...

T-IDBA: A de novo Iterative de Bruijn Graph Assembler for Transcriptome

RNA sequencing based on next-generation sequencing technology is useful for analyzing transcriptomes, discovering novel genes and studying exon/intron structures. Similar to genome assembly, de novo transcriptome assembly does not rely on a reference genome and additional annotated information. Most, if not all, existing de novo transcriptome assemblers rely heavily on de novo genome assembly t...

T-IDBA: A de novo Iterative de Bruijn Graph Assembler for Transcriptome - (Extended Abstract)

RNA sequencing based on next-generation sequencing technology is useful for analyzing transcriptomes, discovering novel genes and studying exon/intron structures. Similar to genome assembly, de novo transcriptome assembly does not rely on a reference genome and additional annotated information. Most, if not all, existing de novo transcriptome assemblers rely heavily on de novo genome assembly t...

Publication date: 2017