Rascaf: Improving Genome Assembly with RNA Sequencing Data.
نویسندگان
چکیده
Abundant but short second-generation sequencing reads make assembly difficult, leading to fragmented genomes and gene annotations. Gene structure information from RNA sequences can be used to improve the completeness and contiguity of an assembly, but bioinformatics methods have been lacking. Rascaf is a highly efficient tool leveraging long-range continuity information from intron spanning RNA sequencing (RNA-seq) read pairs to detect new contig connections. It determines a heaviest path in an exon block graph that simultaneously represents a gene and the underlying contig relationships. Rascaf is more accurate than its competitors, highly precise, and finds thousands of new verifiable connections in several draft Rosaceae genomes. Lightweight and practical, it can be readily incorporated into sequencing pipelines to improve an assembly and its gene annotations.
منابع مشابه
Clustering of Short Read Sequences for de novo Transcriptome Assembly
Given the importance of transcriptome analysis in various biological studies and considering thevast amount of whole transcriptome sequencing data, it seems necessary to develop analgorithm to assemble transcriptome data. In this study we propose an algorithm fortranscriptome assembly in the absence of a reference genome. First, the contiguous sequencesare generated using de Bruijn graph with d...
متن کاملTranscriptome Sequencing of Guilan Native Cow in Comparison with bosTau4 Reference Genome
RNA-sequencing is a new method of transcriptome characterization of organisms. Based on identity and relatedness, there are large genetic variations among different cattle breeds. The goal of the current study was to sequence the transcriptome of Guilan native cow and compare with available reference genome using RNA-sequencing method. Blood samples were collected from 14 Guilan native cows and...
متن کاملLarge Scale Identification of SSR Molecular Markers in Ajowan (Trachyspermum ammi) Using RNA Sequencing
The medicinal plant, Trachyspermum ammi is a rich source of active pharmaceutical ingredients with pharmaceutics effects. Microsatellite markers play a key role in the genome and gene expression, especially in secondary metabolite biosynthesis in medicinal plants. For the first time, transcriptome sequencing of this herb medicine was carried out to identify the microsatellite markers of this sp...
متن کاملDe Novo Assembly of a Bell Pepper Endornavirus Genome Sequence Using RNA Sequencing Data
The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data.
متن کاملImproved hybrid de novo genome assembly of domesticated apple (Malus x domestica)
BACKGROUND Domesticated apple (Malus × domestica Borkh) is a popular temperate fruit with high nutrient levels and diverse flavors. In 2012, global apple production accounted for at least one tenth of all harvested fruits. A high-quality apple genome assembly is crucial for the selection and breeding of new cultivars. Currently, a single reference genome is available for apple, assembled from 1...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- The plant genome
دوره 9 3 شماره
صفحات -
تاریخ انتشار 2016