Local sequence and structure features in long RNAs

Authors

  • Hoor K. Al-Hasani
  • Rolf Backofen
  • Christian Schindelhauer
  • Steffen Heyen
Abstract

This work proposes an automatic method for analyzing different data sets based on their features. The features of human lincRNA were compared with those of mRNA. These features include sequence composition, structure, and base-pair probabilities. A generic Background Model (BGM) was then computed by fitting the values of each feature to a selection of distributions. The fitted distribution serves as the basis for determining cutoff values that mark significant regions within the sequences at different significance levels. With the help of the BGM, significant regions of lincRNA sequences could be highlighted and compared across features. The BGM was also used to compare the feature distributions in lincRNA with those of the different parts of the mRNA, namely the 5’UTR, the coding sequence, and the 3’UTR. Finally, each step of the approach was visualized in automatically generated graphs, making it easier to identify regions of interest and to check the computed results manually.
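The cutoff step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes windowed GC content as the sequence-composition feature, and it uses a single normal distribution where the abstract speaks of a selection of distributions. The function names, window sizes, and toy data are all hypothetical.

```python
from statistics import NormalDist


def gc_windows(seq, size=10, step=5):
    """Return (start, GC fraction) for each sliding window over seq."""
    seq = seq.upper()
    out = []
    for start in range(0, len(seq) - size + 1, step):
        window = seq[start:start + size]
        out.append((start, sum(base in "GC" for base in window) / size))
    return out


def fit_background(values):
    """Fit a normal distribution to background feature values.

    Assumption: a normal model stands in for the paper's selection
    of candidate distributions.
    """
    return NormalDist.from_samples(values)


def upper_cutoff(model, alpha=0.05):
    """Feature value exceeded with probability alpha under the fitted model."""
    return model.inv_cdf(1.0 - alpha)


def significant_regions(seq, model, alpha=0.05, size=10, step=5):
    """Return (start, end, value) for windows above the upper cutoff."""
    cutoff = upper_cutoff(model, alpha)
    return [(start, start + size, gc)
            for start, gc in gc_windows(seq, size, step)
            if gc > cutoff]


# Toy background values standing in for GC fractions from many transcripts.
background = [0.30, 0.40, 0.50, 0.40, 0.60, 0.50, 0.40, 0.50, 0.30, 0.60]
model = fit_background(background)

# Toy sequence with a GC-rich stretch in the middle.
query = "AUAUAUAUAU" + "GCGCGCGCGC" + "AUAUAUAUAU"
hits = significant_regions(query, model, alpha=0.05)
print(hits)
```

With this toy data, only the window covering the GC-only stretch exceeds the cutoff. Lowering `alpha` raises the cutoff, which corresponds to the stricter significance levels mentioned in the abstract.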





Publication date: 2011