Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses.

Authors

  • Moran N Cabili
  • Cole Trapnell
  • Loyal Goff
  • Magdalena Koziol
  • Barbara Tazon-Vega
  • Aviv Regev
  • John L Rinn
Abstract

Large intergenic noncoding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalog of >8000 human lincRNAs. Our catalog unifies previously existing annotation sources with transcripts we assembled from RNA-seq data collected from ∼4 billion RNA-seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of >30 properties, including sequence, structural, transcriptional, and orthology features. We found that lincRNA expression is strikingly tissue-specific compared with coding genes, and that lincRNAs are typically coexpressed with their neighboring genes, albeit to an extent similar to that of pairs of neighboring protein-coding genes. We distinguish an additional subset of transcripts that have high evolutionary conservation but may include short ORFs and may serve as either lincRNAs or small peptides. Our integrated, comprehensive, yet conservative reference catalog of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes.

Similar resources

Linkage between Large intergenic non-coding RNA regulator of reprogramming and Stemness State in Samples with Helicobacter pylori Infection of Gastric Cancer Cells

Background: Long noncoding RNAs (lncRNAs), as non-protein coding transcripts, play key roles in tumor progression and stemness state in many malignancies, as their aberrant expression has been found in gastric cancer (GC), one of the most common cancers worldwide. LINC-ROR (large intergenic noncoding RNA regulator of reprogramming) identified as an involved lncRNA in human malignancies, howeve...

Pervasive Transcription of the Human Genome Produces Thousands of Previously Unidentified Long Intergenic Noncoding RNAs

Known protein coding gene exons compose less than 3% of the human genome. The remaining 97% is largely uncharted territory, with only a small fraction characterized. The recent observation of transcription in this intergenic territory has stimulated debate about the extent of intergenic transcription and whether these intergenic RNAs are functional. Here we directly observed with a large set of...

Comprehensive Characterization of 10,571 Mouse Large Intergenic Noncoding RNAs from Whole Transcriptome Sequencing

Large intergenic noncoding RNAs (lincRNAs) have been recognized in recent years to constitute a significant portion of the mammalian transcriptome, yet their biological functions remain largely elusive. This is partly due to an incomplete annotation of tissue-specific lincRNAs in essential model organisms, particularly in mice, which has hindered the genetic annotation and functional characteri...

Re-annotation of presumed noncoding disease/trait-associated genetic variants by integrative analyses

Using RefSeq annotations, most disease/trait-associated genetic variants identified by genome-wide association studies (GWAS) appear to be located within intronic or intergenic regions, which makes it difficult to interpret their functions. We reassessed GWAS-Associated single-nucleotide polymorphisms (herein termed as GASs) for their potential functionalities using integrative approaches. 8834...

lncRNome: a comprehensive knowledgebase of human long noncoding RNAs

The advent of high-throughput genome scale technologies has enabled us to unravel a large amount of the previously unknown transcriptionally active regions of the genome. Recent genome-wide studies have provided annotations of a large repertoire of various classes of noncoding transcripts. Long noncoding RNAs (lncRNAs) form a major proportion of these novel annotated noncoding transcripts, and ...

Journal title:
  • Genes & development

Volume 25, Issue 18

Pages  -

Publication date 2011