De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

Authors

  • Si Lok
  • Tara A. Paton
  • Zhuozhi Wang
  • Gaganjot Kaur
  • Susan Walker
  • Ryan K. C. Yuen
  • Wilson W. L. Sung
  • Joseph Whitney
  • Janet A. Buchanan
  • Brett Trost
  • Naina Singh
  • Beverly Apresto
  • Nan Chen
  • Matthew Coole
  • Travis J. Dawson
  • Karen Ho
  • Zhizhou Hu
  • Sanjeev Pullenayegum
  • Kozue Samler
  • Arun Shipstone
  • Fiona Tsoi
  • Ting Wang
  • Sergio L. Pereira
  • Pirooz Rostami
  • Carol Ann Ryan
  • Amy Hin Yan Tong
  • Karen Ng
  • Yogi Sundaravadanam
  • Jared T. Simpson
  • Burton K. Lim
  • Mark D. Engstrom
  • Christopher J. Dutton
  • Kevin C. R. Kerr
  • Maria Franke
  • William Rapley
  • Richard F. Wintle
  • Stephen W. Scherer

Abstract

The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected, moderate-coverage (< 30×) long reads generated by single-molecule sequencing. The genome size, estimated by k-mer analysis, is 2.7 Gb. We assembled the beaver genome using the new Canu assembler, which is optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80×) and checked for accuracy by congruency against an independent short-read assembly. We scaffolded the assembly using the exon-gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with a contig N50 of 278,680 bp and a scaffold N50 of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly were demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well endowed, were well represented. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology.
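The N50 figures quoted in the abstract follow the usual definition: the length L such that sequences of length L or longer account for at least half of the total assembled bases. The sketch below is a minimal illustration of that definition, not the authors' pipeline. The function name `n50` and the toy contig lengths are hypothetical; only the 2.7 Gb genome size and the 92% figure come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy contig lengths (bp), not data from the paper.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # 80: 100 + 80 = 180 covers at least half of 300

# Rough scaffold span implied by the abstract: 92% of an estimated 2.7 Gb genome.
estimated_genome_bp = 2.7e9
approx_scaffold_span_bp = 0.92 * estimated_genome_bp
print(f"{approx_scaffold_span_bp / 1e9:.2f} Gb")  # 2.48 Gb
```

Applied to the real contig or scaffold lengths of an assembly, the same function would give the contig N50 or scaffold N50, respectively.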

Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Mites of the genus Schizocarpus Trouessart, 1896 (Acariformes: Chirodiscidae) from the North American beavers (Castor canadensis) in Russia.

Four native species of parasitic mites belonging to the genus Schizocarpus Trouessart, 1896 (Acariformes: Chirodiscidae) are recorded on the North American beaver Castor canadensis Kuhl, 1820 (Rodentia: Castoridae) from Russia. In total, ten beavers from all three main geographically isolated populations in Russia (Leningrad Province, Voronezh Biosphere Reserve (beaver farm) and Khabarovsk Ter...

Seasonal differences in the testicular transcriptome profile of free-living European beavers (Castor fiber L.) determined by the RNA-Seq method

The European beaver (Castor fiber L.) is an important free-living rodent that inhabits Eurasian temperate forests. Beavers are often referred to as ecosystem engineers because they create or change existing habitats, enhance biodiversity and prepare the environment for diverse plant and animal species. Beavers are protected in most European Union countries, but their genomic background remains ...

T-IDBA: A de novo Iterative de Bruijn Graph Assembler for Transcriptome

RNA sequencing based on next-generation sequencing technology is useful for analyzing transcriptomes, discovering novel genes and studying exon/intron structures. Similar to genome assembly, de novo transcriptome assembly does not rely on a reference genome and additional annotated information. Most, if not all, existing de novo transcriptome assemblers rely heavily on de novo genome assembly t...

Journal title:

Volume 7, Issue

Pages -

Publication date: 2017