Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences [version 2; referees: 2 approved]

Authors

  • Stephen N. Floor
  • Rob Patro
  • Charlotte Soneson
  • Michael I. Love
  • Mark D. Robinson
Abstract

High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Various quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that the presence of differential isoform usage can lead to inflated false discovery rates in differential gene expression analyses on simple count matrices but that this can be addressed by incorporating offsets derived from transcript-level abundance estimates. We also show that the problem is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.

DOI: 10.12688/f1000research.7563.2
Posted at the Zurich Open Repository and Archive, University of Zurich. ZORA URL: https://doi.org/10.5167/uzh-124742
Published version, originally published at: Soneson, Charlotte; Love, Michael I; Robinson, Mark D (2015). Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research, 4:1521. DOI: 10.12688/f1000research.7563.2
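To make the idea of offsets derived from transcript-level abundance estimates more concrete, the following is a minimal Python sketch, not the tximport implementation. It assumes per-sample dictionaries of estimated counts, TPM values and effective lengths keyed by transcript ID; the function name `gene_level_summary` and all data values are hypothetical. For each gene it sums the transcript-level estimates and computes an abundance-weighted average transcript length, whose logarithm could serve as an offset in a count-based model.

```python
import math
from collections import defaultdict


def gene_level_summary(tx2gene, est_counts, tpm, eff_length):
    """Summarise one sample's transcript-level estimates to gene level.

    Returns {gene: (summed_count, summed_tpm, avg_length)}, where avg_length
    is the TPM-weighted mean effective length of the gene's transcripts.
    """
    counts = defaultdict(float)
    abundance = defaultdict(float)
    weighted_len = defaultdict(float)
    lengths = defaultdict(list)
    for tx, gene in tx2gene.items():
        counts[gene] += est_counts[tx]
        abundance[gene] += tpm[tx]
        weighted_len[gene] += tpm[tx] * eff_length[tx]
        lengths[gene].append(eff_length[tx])

    summary = {}
    for gene in counts:
        if abundance[gene] > 0:
            avg_len = weighted_len[gene] / abundance[gene]
        else:
            # Assumption of this sketch: fall back to the unweighted mean
            # length when the gene has zero estimated abundance.
            avg_len = sum(lengths[gene]) / len(lengths[gene])
        summary[gene] = (counts[gene], abundance[gene], avg_len)
    return summary


# Hypothetical gene "G" with a short and a long isoform. Total TPM is the
# same in both samples, but isoform usage switches from short to long.
tx2gene = {"tx1": "G", "tx2": "G"}
eff_length = {"tx1": 1000.0, "tx2": 3000.0}

sample_a = gene_level_summary(
    tx2gene,
    est_counts={"tx1": 90.0, "tx2": 30.0},
    tpm={"tx1": 90.0, "tx2": 10.0},
    eff_length=eff_length,
)
sample_b = gene_level_summary(
    tx2gene,
    est_counts={"tx1": 10.0, "tx2": 270.0},
    tpm={"tx1": 10.0, "tx2": 90.0},
    eff_length=eff_length,
)

for name, s in (("A", sample_a), ("B", sample_b)):
    count, total_tpm, avg_len = s["G"]
    offset = math.log(avg_len)  # candidate offset for a count-based model
    print(name, count, total_tpm, avg_len, round(offset, 3))
```

In this toy example the summed counts differ between the two samples (120 versus 280) even though total abundance is unchanged, because the longer isoform generates more reads per molecule. Dividing each count by the sample-specific average length gives the same value in both samples, which is the effect an offset is meant to achieve.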


Similar resources


Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences Supplementary Material

The sim2 data set consists of simulated sequencing reads from human chromosome 1. The sequencing parameters as well as underlying TPM values for the 15,677 transcripts in one of the two simulated conditions were estimated using RSEM v1.2.21 [6] from the ERS326990 sample from the ArrayExpress data set with accession number E-MTAB-1733. We simulated three biological replicates from each of tw...


DRIMSeq: a Dirichlet-multinomial framework for multivariate count outcomes in genomics [version 2; referees: 2 approved]

There are many instances in genomics data analyses where measurements are made on a multivariate response. For example, alternative splicing can lead to multiple expressed isoforms from the same primary transcript. There are situations where differences (e.g. between normal and disease state) in the relative ratio of expressed isoforms may have significant phenotypic consequences or lead to pro...


Bayesian estimation of differential transcript usage from RNA-seq data.

Next-generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU), which targets changes in the relative within-gene expression of a transcript. The contribution of this paper is to: (a) extend the ...


Gene-level differential analysis at transcript-level resolution

Gene-level differential expression analysis based on RNA-Seq is more robust, powerful and biologically actionable than transcript-level differential analysis. However, aggregating transcript counts prior to analysis can mask transcript-level dynamics. We demonstrate that aggregating the results of transcript-level analysis allows for gene-level analysis with transcript-level resolution...




Publication date: 2017