UTR Reconstruction and Analysis Using Genomically Aligned EST Sequences

Authors

  • Zhengyan Kan
  • Warren Gish
  • Eric C. Rouchka
  • Jarret Glasscock
  • David J. States
Abstract

Untranslated regions (UTRs) play important roles in the posttranscriptional regulation of mRNA processing. There is a wealth of UTR-related information to be mined from the rapidly accumulating EST collections. A computational tool, UTR-extender, has been developed to infer UTR sequences from genomically aligned ESTs. It can completely and accurately reconstruct 72% of the 3' UTRs and 15% of the 5' UTRs when tested using 908 functionally cloned transcripts. In addition, it predicts extensions for 11% of the 5' UTRs and 28% of the 3' UTRs. These extension regions are validated by examining splicing frequencies and conservation levels. We also developed a method called polyadenylation site scan (PASS) to precisely map polyadenylation sites in human genomic sequences. A PASS analysis of 908 genic regions estimates that 40-50% of human genes undergo alternative polyadenylation. Using EST redundancy to assess expression levels, we also find that genes with short 3' UTRs tend to be highly expressed.
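To make the idea of inferring a UTR extension from genomically aligned ESTs concrete, here is a minimal sketch. It is not the UTR-extender algorithm, which the abstract does not describe in detail. The function name `extend_three_prime`, the half-open `(start, end)` interval representation, the plus-strand orientation, and the `min_overlap` threshold are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open genomic interval (start, end), plus strand


def extend_three_prime(utr_end: int,
                       est_alignments: Iterable[Interval],
                       min_overlap: int = 20) -> int:
    """Toy 3' UTR extension (hypothetical, not the published method).

    An aligned EST supports an extension when it overlaps the current
    transcript end by at least `min_overlap` bases and reaches further
    downstream. Supporting ESTs are chained, so each accepted EST moves
    the boundary and may allow the next one to qualify.
    """
    end = utr_end
    # Processing ESTs in order of start coordinate makes one pass enough:
    # an EST that starts too far downstream to overlap the current end
    # cannot be rescued by any EST that sorts after it.
    for start, stop in sorted(est_alignments):
        overlaps_enough = start <= end - min_overlap
        if overlaps_enough and stop > end:
            end = stop
    return end


# Illustrative, made-up coordinates.
ests = [(900, 1050), (1020, 1200), (1195, 1300), (1500, 1600)]
new_end = extend_three_prime(1000, ests, min_overlap=20)
print(new_end)
```

In this made-up example the first two ESTs chain together, while the third overlaps the extended boundary by only 5 bases and the fourth does not overlap at all, so neither contributes. A real tool would also have to handle strand, splicing, and alignment quality, none of which this sketch attempts.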

Similar resources

P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis

Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and they are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, each prepared from one particular cell type, organ, or tumor. Therefore, th...

In silico prediction of UTR repeats using clustered EST data

Clustering of EST data is a method for the non-redundant representation of an organism's transcriptome. During clustering of large amounts of EST data, some large clusters (>500 sequences) are usually created. These can lead to iterative contig builds, heavy consumption of computing time, and improbable exon alignments, all of which are undesirable. In addition, these clusters sometimes contain transc...

In silico analysis of EST and genomic sequences allowed the prediction of cis-regulatory elements for Entamoeba histolytica mRNA polyadenylation

In most eukaryotic cells, the poly(A) tail at the 3'-end of messenger RNA (mRNA) is essential for nuclear export, translatability, stability and transcription termination. Poly(A) tail formation involves multi-protein complexes that interact with specific sequences in 3'-untranslated region (3'-UTR) of precursor mRNA (pre-mRNA). Here we have performed a computational analysis of a large EST and...

Selecting for functional alternative splices in ESTs.

The expressed sequence tag (EST) collection in dbEST provides an extensive resource for detecting alternative splicing on a genomic scale. Using genomically aligned ESTs, a computational tool (TAP) was used to identify alternative splice patterns for 6400 known human genes from the RefSeq database. With sufficient EST coverage, one or more alternatively spliced forms could be detected for nearl...

Cytoplasmic poly(A) binding protein-1 binds to genomically encoded sequences within mammalian mRNAs.

The functions of the major mammalian cytoplasmic poly(A) binding protein, PABPC1, have been characterized predominantly in the context of its binding to the 3' poly(A) tails of mRNAs. These interactions play important roles in post-transcriptional gene regulation by enhancing translation and mRNA stability. Here, we performed transcriptome-wide CLIP-seq analysis to identify additional PABPC1 bi...

Journal:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 8

Publication date: 2000