Rascaf: Improving Genome Assembly with RNA Sequencing Data.

Authors

  • Li Song
  • Dhruv S Shankar
  • Liliana Florea
Abstract

Abundant but short second-generation sequencing reads make assembly difficult, leading to fragmented genomes and gene annotations. Gene structure information from RNA sequences can be used to improve the completeness and contiguity of an assembly, but bioinformatics methods have been lacking. Rascaf is a highly efficient tool leveraging long-range continuity information from intron-spanning RNA sequencing (RNA-seq) read pairs to detect new contig connections. It determines a heaviest path in an exon block graph that simultaneously represents a gene and the underlying contig relationships. Rascaf is more accurate than its competitors, highly precise, and finds thousands of new verifiable connections in several draft Rosaceae genomes. Lightweight and practical, it can be readily incorporated into sequencing pipelines to improve an assembly and its gene annotations.
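
The "heaviest path" idea mentioned in the abstract can be illustrated with a generic sketch. This is not Rascaf's implementation; it is a standard dynamic-programming search for the maximum-weight path in a directed acyclic graph, assuming nodes stand for exon blocks and edge weights stand for the number of supporting read pairs. The function name `heaviest_path`, the node labels (such as `c1:e1`, meaning contig 1, exon block 1), and all weights are hypothetical.

```python
from collections import defaultdict


def heaviest_path(edges):
    """Return (total_weight, path) for the heaviest path in a weighted DAG.

    edges: iterable of (src, dst, weight) tuples with positive weights.
    Raises ValueError if the graph contains a cycle.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst, weight in edges:
        graph[src].append((dst, weight))
        indegree[dst] += 1
        nodes.update((src, dst))

    # Topological order via Kahn's algorithm.
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for nxt, _ in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # best[n] is the weight of the heaviest path ending at n.
    best = {n: 0 for n in nodes}
    prev = {n: None for n in nodes}
    for node in order:
        for nxt, weight in graph[node]:
            if best[node] + weight > best[nxt]:
                best[nxt] = best[node] + weight
                prev[nxt] = node

    # Walk back from the node where the heaviest path ends.
    end = max(order, key=lambda n: best[n])
    total = best[end]
    path = []
    while end is not None:
        path.append(end)
        end = prev[end]
    return total, path[::-1]


# Hypothetical exon blocks on three contigs; weights are made-up
# counts of read pairs supporting each connection.
EXAMPLE_EDGES = [
    ("c1:e1", "c1:e2", 12),
    ("c1:e2", "c2:e3", 7),
    ("c1:e2", "c3:e3", 2),
    ("c2:e3", "c2:e4", 9),
    ("c3:e3", "c2:e4", 1),
]

if __name__ == "__main__":
    print(heaviest_path(EXAMPLE_EDGES))
```

In this toy graph the chosen path crosses from contig `c1` to contig `c2`, which is the kind of contig connection the abstract describes; how Rascaf actually builds and weights its exon block graph is not specified on this page.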

Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Transcriptome Sequencing of Guilan Native Cow in Comparison with bosTau4 Reference Genome

RNA-sequencing is a new method of transcriptome characterization of organisms. Based on identity and relatedness, there are large genetic variations among different cattle breeds. The goal of the current study was to sequence the transcriptome of Guilan native cow and compare it with the available reference genome using the RNA-sequencing method. Blood samples were collected from 14 Guilan native cows and...

Large Scale Identification of SSR Molecular Markers in Ajowan (Trachyspermum ammi) Using RNA Sequencing

The medicinal plant Trachyspermum ammi is a rich source of active pharmaceutical ingredients with pharmaceutical effects. Microsatellite markers play a key role in the genome and gene expression, especially in secondary metabolite biosynthesis in medicinal plants. For the first time, transcriptome sequencing of this herbal medicine was carried out to identify the microsatellite markers of this sp...

De Novo Assembly of a Bell Pepper Endornavirus Genome Sequence Using RNA Sequencing Data

The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data.

Improved hybrid de novo genome assembly of domesticated apple (Malus x domestica)

Background: Domesticated apple (Malus × domestica Borkh) is a popular temperate fruit with high nutrient levels and diverse flavors. In 2012, global apple production accounted for at least one tenth of all harvested fruits. A high-quality apple genome assembly is crucial for the selection and breeding of new cultivars. Currently, a single reference genome is available for apple, assembled from 1...

Journal:
  • The Plant Genome

Volume 9, Issue 3

Pages: -

Publication date: 2016